(57) Abstract: The present invention relates to methods and compositions for assessing the quality of microarrays. In particular, the
invention relates to the use of quality control probes that are synthesized on the microarray monomer by monomer in a step-by-step
synthesis. By assessing the degree of signal from the quality control probes and determining their deviation from expected signal
intensities, the quality of microarray synthesis can be ascertained. The invention further relates to a method of detecting defects
occurring during storage or processing of the microarray. The invention further relates to a method of using a computer to identify
microarrays that have had a defect or defects during synthesis, storage, or processing.
METHODS TO ASSESS QUALITY OF MICROARRAYS

This application claims priority to U.S. Provisional Application Serial No.
60/392,629, filed June 28, 2002, which is incorporated herein by reference in its entirety.



1. FIELD OF THE INVENTION

The present invention relates to methods and compositions for assessing the quality
of microarray synthesis. The invention further relates to a method of detecting defects
occurring during storage or processing of the microarray. In particular, the invention relates
to the use of quality control probes that are synthesized on the microarray for assessing
microarray quality. The invention further relates to a method of using a computer to
identify microarrays that have a defect or defects, e.g., arising during synthesis, storage, or
processing.

2. BACKGROUND OF THE INVENTION

DNA array technologies have made it possible, inter alia, to monitor the expression
levels of a large number of genetic transcripts at any one time (see, e.g., Schena et al., 1995,
Science 270:467-470; Lockhart et al., 1996, Nature BioTechnology 14:1675-1680;
Blanchard et al., 1996, Nature BioTechnology 14:1649; Shoemaker et al., U.S. Patent
Application Serial No. 09/724,538, filed on November 28, 2000). DNA array technologies
have also found applications in gene discovery, e.g., in identification of exon structures of
genes (see, e.g., Shoemaker et al., U.S. Patent Application Serial No. 09/724,538, filed on
November 28, 2000; Meltzer, 2001, Curr. Opin. Genet. Dev. 11(3):258-63; Andrews et al.,
2000, Genome Res. 10(12):2030-43; Abdellatif, 2000, Circ. Res. 86(9):919-20; Lennon,
2000, Drug Discov. Today 5(2):59-66; Zweiger, 1999, Trends Biotechnol. 17(11):429-36).

By simultaneously monitoring tens of thousands of genes, microarray technologies
have allowed, inter alia, genome-wide analysis of mRNA expression in a cell or a cell type
or any biological sample. Aided by sophisticated data management and analysis
methodologies, the transcriptional state of a cell or cell type as well as changes of the
transcriptional state in response to external perturbations, including but not limited to drug
perturbations, can be characterized on the mRNA level (see, e.g., U.S. Patent No.
6,203,987; Stoughton et al., International Publication No. WO 00/24936 (published May 4,
2000); Stoughton et al., International Publication No. WO 00/39336 (published July 6,
2000); Friend et al., International Publication No. WO 00/24936 (published May 4, 2000)).
Applications of such technologies include, for example, identification of genes which are
up-regulated or down-regulated in various physiological states, particularly diseased states.
Additional exemplary uses for DNA arrays include the analyses of members of signaling
pathways, and the identification of targets for various drugs. See, e.g., Friend and Hartwell,
International Publication No. WO 98/38329 (published September 3, 1998); Friend and
Stoughton, International Publication No. WO 99/59037 (published November 18, 1999);
U.S. Patent Nos. 6,132,969; 5,965,352; 6,218,122.

A microarray is an array of positionally-addressable binding (e.g., through
hybridization) sites on a support. Each of such binding sites comprises a plurality of
biopolymer molecules of a probe bound to a predetermined region on the support.
Microarrays can be fabricated in a number of ways, including immobilization of
pre-synthesized probes on the support or the in situ synthesis of probes on the support. For
example, immobilization of pre-synthesized probes can be done robotically as described in
DeRisi et al. (1997, Science 278(5338):680-6) or by inkjet. In situ synthesis can be
accomplished by different means, including using inkjet technology or by light-activated
synthesis (Holmes et al., 1995, Biopolymers 37(3):199-211; Jacobs et al., 1994, Trends
Biotechnol. 12(1):19-26; Fodor et al., 1991, Science 251(4995):767-73). In either case of in
situ synthesis, chemical reactions take place on the support in which a monomer or
monomers are added to the biopolymer. As the biopolymer chain grows, however, there is
a chance that one or more of the synthesis cycles may fail (either fully or partially) thereby
producing a probe that lacks one or more of the intended monomers. Synthesis efficiency
depends on multiple factors including reagent purity, reaction time, correct alignment of the
inkjet head, etc. Defects in any of these processes can result in inefficient addition of a
monomer or monomers to the growing biopolymer chain.

In addition, in the case of an inkjet-synthesized microarray, a synthesis defect may
also occur when one of the nozzles of the inkjet head fails to deliver a reagent properly
(e.g., if the nozzle becomes temporarily or permanently obstructed). A nozzle failure refers
to any malfunction of an individual ink jet nozzle. If a nozzle fails to deliver the desired
solution required for biopolymer addition, it is sometimes referred to as being "clogged." A
nozzle failure can occur at any point during microarray synthesis. A failure at the
beginning of the synthesis may be due to insufficient priming of new reagents through the
nozzles. A nozzle failure can also occur after the printing of a set of microarrays has begun
if, e.g., there are trapped air bubbles or particulates. Nozzle failures can be detected and
corrected before a microarray is synthesized. Before the start of each synthesis batch and at
the end of each synthesis batch every nozzle on the printhead can be tested to make sure
that it is properly functioning. This can be done by placing a clean substrate on top of the
head assembly before forcing each nozzle to extrude a small amount of liquid. If all nozzles
are working properly, there will be a drop of liquid corresponding to each nozzle. If,
however, one or more nozzles is malfunctioning, the drop corresponding to that nozzle
position is missing. Because of the small size of the drops, a nozzle failure can be
overlooked occasionally due to human error, and an array will be synthesized that shows
evidence of a nozzle failure. Currently there exists a need for a more reliable method to
determine if synthesis failures have occurred and, if so, where and when they happened
during the course of microarray synthesis. Whereas it is possible to perform quality control
on pre-synthesized probes by conventional DNA sequencing, by mass spectroscopy, or by
other means, methods to assess the quality of probes synthesized in situ are lacking.

This application describes a method designed to assess the quality of microarray
synthesis. The herein disclosed invention describes methods for the design and production
of quality control probes on the microarray and methods for analysis of the information
obtained from microarray processing that permit the determination of the overall quality of
synthesis as well as the identity of the synthesis cycle most likely to have been defective.
This invention also includes a database that contains information concerning the position
and identity of the quality control probes on the microarray.

Citation or discussion of a reference herein shall not be construed as an admission
that such is prior art to the present invention.

3. SUMMARY OF THE INVENTION 

The present invention relates to methods and compositions to assess the quality of
microarrays where the biopolymer probes are synthesized on the array substrate monomer
by monomer in a step-by-step synthesis. In particular, failures or inefficiencies in the
deposition of individual synthesis cycles of the microarray are detected through the
inclusion of quality control probes on the microarray. The quality control probes are
synthesized onto the microarray concurrently with the other biopolymer probes and thus
would also be subject to any synthesis failures or inefficiencies that may occur. By
assessing the degree of signal from the quality control probes and determining their
deviation from expected signal intensities, the quality of microarray synthesis can be
ascertained.

In one embodiment, each group of quality control probes comprises the same
predetermined binding sequence for which a binding partner exists in or is introduced into
the sample to be contacted with the microarray for analysis. The synthesis of the
predetermined binding sequence in each quality control probe is initiated during the
step-by-step synthesis at sequential cycles of synthesis. By assessing the degree of binding of a
biopolymer capable of binding to the predetermined binding sequence of the quality control
probe, the quality of microarray synthesis can be determined. In another embodiment, the
quality control probes do not comprise a predetermined binding sequence. A detectable
signal is generated by the quality control probe itself rather than a labeled binding partner
binding to the predetermined binding sequence. This can be accomplished by, e.g.,
incorporation of one or more labeled monomers into the quality control probe, staining of
the quality control probe with a fluorescent dye, etc.

In a preferred embodiment, the invention relates to methods of detecting synthesis
failures on an oligonucleotide microarray. In a more preferred embodiment, the invention
relates to methods of detecting synthesis defects including nozzle failures during the
synthesis of an ink jet oligonucleotide microarray. In addition to synthesis failures, other
defects that affect microarray quality can also be detected, e.g., those due to degradation of
probes during storage or processing of the microarray.

The invention provides a positionally addressable array comprising a substrate to
which are attached a plurality of different biopolymer probes, said different biopolymer
probes in said plurality being situated at different positions on said surface and being the
product of a step-by-step synthesis of said biopolymer probes on said substrate, said
plurality of different binding probes comprising a plurality of quality control probes, the
synthesis of said quality control probe having been initiated during said step-by-step
synthesis at sequential cycles of synthesis. Each quality control probe in said plurality
comprising a predetermined binding sequence preferably comprises the same predetermined
binding sequence or alternatively a different predetermined binding sequence but with the
same binding specificity or similar binding characteristics (e.g., bind to their respective
binding partner with similar intensities under the same binding conditions). In one
embodiment, predetermined binding sequences of different lengths can be used (e.g., a
25mer and a 24mer).

In one specific embodiment of the array, the sequence of each said quality control
probe of said plurality consists of said predetermined binding sequence.

In another specific embodiment, the plurality of quality control probes comprise a
second sequence consisting of a chemical structure contiguous with said predetermined
binding sequence, wherein at least some of the quality control probes differ from other of
the quality control probes in length of said chemical structure. In a specific embodiment,
the chemical structure is a sequence of number 0 to N monomers contiguous with said
predetermined binding sequence, and where N is a whole number equal to or greater than 1.
In a specific embodiment, the biopolymer probes are oligonucleotides, said predetermined
sequence consists of 25 nucleotides, and said biopolymer probes that are not said quality
control probes consist of 60 nucleotides. In a specific embodiment, N is not greater than the
number of monomers in said biopolymer probes on the array that are not said quality
control biopolymer probes minus the number of monomers in said predetermined binding
sequence. In another specific embodiment, the quality control probes comprise a greater
number of monomers than biopolymer probes on the array that are not said quality control
biopolymer probes. In a further specific embodiment, an array comprises 3, 10, 30, 60 or
more of said quality control probes that differ in N. A particular embodiment is wherein N
is 0, 20, and 35, respectively, for different quality control probes.

In yet another specific embodiment, the plurality of quality control probes comprise

(i) quality control probes whose sequence consists of said predetermined
sequence; and

(ii) quality control probes that comprise a second sequence of number 0 to N
monomers contiguous with said predetermined binding sequence, wherein at least some of
said quality control probes differ from other of said quality control probes in the number of
said monomers, and where N is a whole number equal to or greater than 1.

In various specific embodiments, the biopolymer probes are nucleic acids, proteins,
or antibodies. Preferably the predetermined binding sequence is in the range of 10-40
nucleotides in length, and more preferably, is 25 nucleotides in length. In a specific
embodiment, the predetermined binding sequence is SEQ ID NO:1 or SEQ ID NO:2 or a
complement thereof.

In one embodiment, the biopolymer probes consist of a sequence in the range of
20-100 nucleotides.

Preferably, the predetermined binding sequence of the quality control biopolymer
probe is between 10-75% of the length of the biopolymer probes on the array
that are not quality control probes.

In a specific embodiment, the predetermined binding sequence consists of 25
monomers, and the biopolymer probes on the array that are not said quality control probes
consist of 60 monomers.
The invention also provides a method of determining if a positionally-addressable
biopolymer array has a synthesis defect comprising the following steps in the order stated:

a) contacting an array of the invention with a sample comprising a binding
partner that binds said predetermined binding sequence;

b) detecting or measuring binding between two or more of said quality control
probes and said binding partner in the sample; and

c) comparing binding of said two or more of said quality control probes,
wherein if said binding is similar, the absence of a synthesis defect between said sequential
cycles of synthesis of said array is indicated.
The invention further provides a method of determining if a positionally-addressable
biopolymer array has a synthesis defect comprising the following steps in the order stated:

a) contacting an array as described above containing the quality control probes
comprising the 0 to N monomer contiguous sequence, with a sample comprising a binding
partner that binds said predetermined binding sequence;

b) detecting or measuring binding between (i) two or more of said quality
control probes that differ in the number of said monomers; and (ii) said binding partner in
the sample; and

c) comparing binding of said two or more of said quality control probes,
wherein if said binding is similar, the absence of a synthesis defect between said sequential
cycles of synthesis used to synthesize said two or more quality probes is indicated.

The invention further provides a method of determining if a positionally-addressable
biopolymer array has a synthesis defect caused by a nozzle failure comprising the following
steps in the order stated:

a) contacting the array of the invention with a sample comprising a binding
partner that binds said predetermined binding sequence, wherein at least a portion of said
plurality of quality control probes is arranged in a periodicity of P and wherein said array is
synthesized by step-by-step synthesis using an inkjet printhead with P nozzles, wherein P is
a whole number equal to or greater than 1;

b) detecting or measuring binding between two or more of said quality control
probes and said binding partner in the sample; and

c) comparing binding of said two or more of said quality control probes in a
periodicity of P, wherein if said binding is similar, the absence of a nozzle defect is
indicated.
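
By way of illustration only, the comparison in a periodicity of P described in steps a) to c) above can be sketched in Python. The sketch assumes that the quality control probe signals are listed in print order, so that the probe at index i was printed by nozzle i modulo P; that mapping, the zero-based nozzle numbering, the function name `suspect_nozzles`, and the tolerance `max_fold` are hypothetical choices made for the example and are not taken from the specification.

```python
from statistics import mean, median


def suspect_nozzles(intensities, num_nozzles, max_fold=2.0):
    """Return zero-based indices of nozzles whose probes look abnormal.

    intensities: quality control probe signals in print order; the probe at
        index i is ASSUMED to have been printed by nozzle i % num_nozzles.
    num_nozzles: P, the number of nozzles on the printhead.
    max_fold: hypothetical tolerance; a nozzle is flagged when its mean
        signal differs from the median nozzle signal by more than this factor.
    """
    if num_nozzles < 1 or len(intensities) < num_nozzles:
        raise ValueError("need at least one probe signal per nozzle")

    # Group probe signals by the nozzle assumed to have printed them.
    by_nozzle = [intensities[n::num_nozzles] for n in range(num_nozzles)]
    nozzle_means = [mean(group) for group in by_nozzle]

    # Use the median nozzle as the reference for "similar" binding.
    reference = median(nozzle_means)
    if reference <= 0:
        raise ValueError("reference signal must be positive")

    flagged = []
    for nozzle, value in enumerate(nozzle_means):
        ratio = value / reference
        if ratio < 1.0 / max_fold or ratio > max_fold:
            flagged.append(nozzle)
    return flagged


# Twelve probes printed by a four-nozzle head; the third nozzle is weak.
signals = [100, 98, 5, 102, 101, 99, 4, 100, 99, 100, 6, 98]
print(suspect_nozzles(signals, 4))
```

An empty result corresponds to similar binding across the period, i.e., no nozzle defect is indicated under the assumed tolerance.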
In the foregoing methods, the comparing step can comprise determining the binding
ratio of two of said two or more quality control probes, wherein said binding ratio is the
amount of binding of a first of said two quality control probes with said binding partner,
divided by the amount of binding of a second of said two quality control probes with said
binding partner, and wherein said binding ratio between 0.5 and 2.0 indicates the absence
of said synthesis defect.
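
As a minimal sketch of the binding ratio test just described, the following Python example divides the signal of a first quality control probe by that of a second and checks the result against the 0.5 to 2.0 range. The function names are hypothetical, and treating the bounds as inclusive is an assumption, since the text does not say whether the endpoints are included.

```python
def binding_ratio(first_signal, second_signal):
    """Amount of binding of the first probe divided by that of the second."""
    if second_signal <= 0:
        raise ValueError("second_signal must be positive")
    return first_signal / second_signal


def no_defect_indicated(first_signal, second_signal, low=0.5, high=2.0):
    """True when the binding ratio lies within [low, high].

    The inclusive bounds are an assumption made for this example.
    """
    return low <= binding_ratio(first_signal, second_signal) <= high


# Two probes with comparable signal, then a pair with a large drop.
print(no_defect_indicated(1200, 1000))
print(no_defect_indicated(300, 1000))
```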

In a specific embodiment, the foregoing methods further comprise before step (a)
the step of synthesizing said array.

In a specific embodiment, the sample comprises (i) total cellular RNA or mRNA
from one or more cells or a plurality of nucleic acids derived therefrom, and (ii) said
binding partner, wherein said binding partner is not expressed by said cells.

The invention also provides a method of making a positionally-addressable array of
a plurality of different biopolymer probes comprising synthesizing said plurality of different
biopolymer probes on a substrate from monomers using a step-by-step synthesis such that
each of said different biopolymer probes is attached to said substrate at a different position
on said substrate, wherein said plurality of different biopolymer probes comprise a plurality
of quality control probes, each quality control probe in said plurality comprising the same
predetermined binding sequence, wherein the synthesis of said predetermined binding
sequence in each of said quality control probes is initiated during said step-by-step
synthesis at sequential cycles of synthesis. The array thus made can have the characteristics
described above.

The invention further provides an oligonucleotide comprising a nucleotide sequence
of SEQ ID NO:1 or SEQ ID NO:2 or the complement thereof.

4. DESCRIPTION OF THE FIGURES

FIG. 1 illustrates an ink jet oligonucleotide microarray that was synthesized with
three malfunctioning nozzles. Entire rows corresponding to nozzles 4, 15, and 20 were not
synthesized due to nozzle malfunction.

FIGS. 2A-2B schematically illustrate the use of quality control probes with spacers
to determine the synthesis quality of an oligonucleotide microarray. (A) The 25 nucleotide
long probe was either synthesized directly onto the microarray or was attached to a spacer
of varying lengths (i.e., 20 nucleotides or 35 nucleotides). (B) A synthesis error in
synthesis cycle 24 is depicted and thus affects the sequence of monomers in the
predetermined binding sequence in only the first two quality control probes shown. The
solid line depicts the quality control probe and the dashed line depicts the spacer.
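
The relationship between spacer length and the synthesis cycles that build the predetermined binding sequence can be sketched as follows. The sketch assumes that the spacer is synthesized first, beginning at synthesis cycle 1, and that the 25 nucleotide binding sequence follows immediately; this ordering is an assumption chosen because it reproduces the cycle 24 example of FIG. 2B, and the function names are hypothetical.

```python
BINDING_LENGTH = 25  # nucleotides in the predetermined binding sequence


def binding_cycles(spacer_length, binding_length=BINDING_LENGTH):
    """Return (first, last) synthesis cycles, 1-based and inclusive, that
    add the binding sequence, ASSUMING the spacer occupies cycles
    1..spacer_length and the binding sequence follows directly."""
    first = spacer_length + 1
    return first, first + binding_length - 1


def binding_sequence_affected(spacer_length, defective_cycle,
                              binding_length=BINDING_LENGTH):
    """True if the defective cycle falls inside the binding sequence."""
    first, last = binding_cycles(spacer_length, binding_length)
    return first <= defective_cycle <= last


# Spacers of 0, 20, and 35 nucleotides with a defect in synthesis cycle 24.
for spacer in (0, 20, 35):
    print(spacer, binding_cycles(spacer),
          binding_sequence_affected(spacer, 24))
```

Under this assumption the binding sequences occupy cycles 1-25, 21-45, and 36-60, so a defect in cycle 24 falls within the binding sequence of the first two probes only.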

FIGS. 3A-3B schematically illustrate the use of staggered start quality control
probes to determine the synthesis quality of an oligonucleotide microarray. (A) A series of
25 nucleotide quality control probes are synthesized directly on the microarray starting at
synthesis cycle 1 through synthesis cycle 36. The only difference between the quality
control probes is the synthesis cycle at which synthesis begins. (B) A synthesis error in
synthesis cycle 29 is depicted and thus only affects the quality control probes in which
synthesis cycle 29 was actually used to add a monomer to the sequence of the quality
control probe (i.e., those quality control probes that begin synthesis at synthesis cycles 5-
29). The bold line depicts the quality control probe and the thin line depicts synthesis
cycles that had no monomer deposited.
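
The set of staggered start probes affected by a defect in a given synthesis cycle follows from simple arithmetic: a 25 nucleotide probe that begins at cycle s uses cycles s through s + 24. A minimal Python sketch, in which the function name and default parameters are hypothetical and the defaults mirror the 25 nucleotide probes and start cycles 1 through 36 described for FIG. 3A:

```python
def affected_start_cycles(defective_cycle, probe_length=25, last_start=36):
    """Start cycles of staggered probes whose synthesis uses defective_cycle.

    A probe starting at cycle s occupies cycles s .. s + probe_length - 1,
    and probes are assumed to start at every cycle from 1 to last_start.
    """
    earliest = max(1, defective_cycle - probe_length + 1)
    latest = min(last_start, defective_cycle)
    return list(range(earliest, latest + 1))


# A defect in synthesis cycle 29 affects probes that start at cycles 5-29.
affected = affected_start_cycles(29)
print(affected[0], affected[-1])
```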

FIGS. 4A-4B illustrate the use of quality control probes comprising a spacer to
determine the synthesis quality of an oligonucleotide microarray when there were no known
or detectable synthesis defects during oligonucleotide microarray synthesis. (A)
Microarray image after hybridization to a fluorescently labeled oligonucleotide that
hybridized to the quality control probes. (B) Higher magnification of the microarray in (A)
that depicts the positions of the 25mer, 40mer, and 60mer.

FIGS. 5A-5B illustrate the use of quality control probes comprising a spacer to
determine the synthesis quality of an oligonucleotide microarray when the first synthesis
cycle was intentionally skipped during oligonucleotide microarray synthesis. (A)
Microarray image after hybridization to a fluorescently labeled oligonucleotide that
hybridized to the quality control probe. (B) Higher magnification of the microarray in (A)
that depicts the positions of the 25mer, 40mer, and 60mer.

FIGS. 6A-6B illustrate the use of quality control probes comprising a spacer to
determine the synthesis quality of an oligonucleotide microarray when the first and second
synthesis cycles were intentionally skipped during oligonucleotide microarray synthesis. (A)
Microarray image after hybridization to a fluorescently labeled oligonucleotide that
hybridized to the quality control probe. (B) Higher magnification of the microarray in (A)
that depicts the positions of the 25mer, 40mer, and 60mer.

FIGS. 7A-7B illustrate the use of quality control probes comprising a spacer to
determine the synthesis quality of an oligonucleotide microarray when the thirty sixth
synthesis cycle was intentionally skipped during oligonucleotide microarray synthesis. (A)
Microarray image after hybridization to a fluorescently labeled oligonucleotide that
hybridized to the quality control probe. (B) Higher magnification of the microarray in (A)
that depicts the positions of the 25mer, 40mer, and 60mer.

FIGS. 8A-8B illustrate the use of quality control probes comprising a spacer to
determine the synthesis quality of an oligonucleotide microarray when the thirty fourth and
thirty fifth synthesis cycles were intentionally skipped during oligonucleotide microarray
synthesis. (A) Microarray image after hybridization to a fluorescently labeled
oligonucleotide that hybridized to the quality control probe. (B) Higher magnification of
the microarray in (A) that depicts the positions of the 25mer, 40mer, and 60mer.



FIGS. 9A-9B illustrate the use of quality control probes comprising a spacer to
determine the synthesis quality of an oligonucleotide microarray when there was inefficient
synthesis in the first twenty two synthesis cycles during oligonucleotide microarray
synthesis. (A) Microarray image after hybridization to a fluorescently labeled
oligonucleotide that hybridized to the quality control probe. (B) Higher magnification of
the microarray in (A) that depicts the positions of the 25mer, 40mer, and 60mer.

FIGS. 10A-10B illustrate the use of staggered start quality control probes to
determine the synthesis quality of an oligonucleotide microarray when there was inefficient
synthesis in the first and second synthesis cycles during oligonucleotide microarray
synthesis. (A) Microarray image after hybridization to a fluorescently labeled
oligonucleotide that hybridized to the quality control probe. (B) The mean fluorescence
intensity plot of the quality control probes at each synthesis cycle.

FIGS. 11A-11B illustrate the use of staggered start quality control probes to
determine the synthesis quality of an oligonucleotide microarray when there was inefficient
synthesis in the first five synthesis cycles during oligonucleotide microarray synthesis. (A)
Microarray image after hybridization to a fluorescently labeled oligonucleotide that
hybridized to the quality control probe. (B) The mean fluorescence intensity plot of the
quality control probes at each synthesis cycle.

FIGS. 12A-12B illustrate the use of staggered start quality control probes to
determine the synthesis quality of an oligonucleotide microarray when there was inefficient
synthesis in the first eight synthesis cycles during oligonucleotide microarray synthesis.
(A) Microarray image after hybridization to a fluorescently labeled oligonucleotide that
hybridized to the quality control probe. (B) The mean fluorescence intensity plot of the
quality control probes at each synthesis cycle.

FIGS. 13A-13B illustrate the use of staggered start quality control probes to
determine the synthesis quality of an oligonucleotide microarray when there was inefficient
synthesis in the forty fifth to sixtieth synthesis cycles during oligonucleotide microarray
synthesis. (A) Microarray image after hybridization to a fluorescently labeled
oligonucleotide that hybridized to the quality control probe. (B) The mean fluorescence
intensity plot of the quality control probes at each synthesis cycle.

FIGS. 14A-14B illustrate the increased sensitivity of a single-deletion quality
control probe. Microarrays with synthesis defects in the thirty fourth and thirty fifth
synthesis cycles were synthesized with quality control probes either (A) without or (B) with
an intentional single deletion in the predetermined binding sequence. The labeled reverse
complement of the full-length 25 nucleotide predetermined binding sequence was used to
hybridize with each microarray. The mean fluorescence intensity plot of the quality control
probes at each synthesis cycle was determined for each microarray.

FIGS. 15A-15C illustrate correlations between fluor reversed pairs for a microarray
that had skipped the first twenty two synthesis cycles during synthesis (A); a microarray
that had no synthesis defect (B); and a microarray that had skipped the first twenty two
synthesis cycles during synthesis with a microarray that had no synthesis defect (C).

FIGS. 16A-16D illustrate correlations between oligonucleotide microarrays that had
no synthesis defects with oligonucleotide microarrays that had the first (A), first and second
(B), thirty sixth (C), or thirty fourth and thirty fifth (D) synthesis cycles skipped during
synthesis.

FIGS. 17A-17D schematically illustrate a microarray with quality control probes
attached to the substrate. (A) outer gridline, (B) diagonal gridline, (C) internal cluster, (D)
corner cluster.

5. DETAILED DESCRIPTION OF THE INVENTION

The object of the present invention is to assess the quality of microarray synthesis
for arrays where the biopolymer probes are synthesized on the array substrate monomer by
monomer in a step-by-step synthesis. This object is fulfilled by the synthesis of quality
control probes on the microarray to be assessed. The quality control probes are synthesized
in the same manner as, and in conjunction with, the other biopolymer probes on the
microarray.

The quality control probes may comprise a predetermined binding sequence. This
predetermined binding sequence has a binding partner that can be used to detect the
presence of the predetermined binding sequence during microarray processing. In some
embodiments, the quality control probe also comprises a chemical structure contiguous with
the predetermined binding sequence (such chemical structure referred to herein as a
"spacer"). The spacer is preferably a polymer (e.g., a sequence) of additional monomers
attached to (contiguous with) the predetermined binding sequence. Upon completion of
microarray synthesis, the quality control probes are detected by binding to a labeled binding
partner. The degree of binding is quantified for each quality control probe and compared to
the binding intensities of other quality control probes. Similar binding intensities indicate
synthesis was equally efficient throughout the synthesis.

In another specific embodiment, the quality control probes do not comprise a
predetermined binding sequence. In such an embodiment, the signal observed with this
type of quality control probe is emitted either by 1) the monomers that make up the quality
control probe directly or 2) a label (e.g., a dye) that interacts with or is attached to the
monomers that make up the quality control probe. Deviation from the expected binding
intensities indicates a defect in the array, e.g., due to a synthesis defect, or degradation
during storage or processing.

Although the invention is generally described in terms of the use of one group of
quality control probes, it will be understood that different groups of quality control probes
can also be used on a single microarray. The different groups of quality control probes may
have different predetermined binding sequences or may be a mixture of quality control
probes with and without predetermined binding sequences. The quality control probes may
also be a mixture of different lengths (e.g., a mixture of quality control probes comprising
predetermined binding sequences of 25mers or 24mers).

5.1 QUALITY CONTROL PROBES WITH PREDETERMINED BINDING SEQUENCES

5.1.1 PREDETERMINED BINDING SEQUENCES

Quality control probes with predetermined binding sequences are biopolymers that
comprise a predetermined binding sequence and do not interfere with the results of the
intended microarray processing. So as to avoid cross-reactivity in binding, biopolymers of
the sample to be assayed should not bind to the quality control probes on the microarray.
Also, the reverse complement of the predetermined binding sequence used to bind to and
detect the quality control probes should not bind to the test probes (i.e., probes on the
microarray designed to bind biopolymers of the sample) on the microarray. In the method
of the present invention, the quality control probe is made according to the particular
requirements of the combination of origin, preparation, and processing of the sample to be
analyzed on the microarray to be synthesized. Preferably, wherein the sample to be
analyzed on the microarray comprises naturally occurring nucleic acids or proteins, the
predetermined binding sequence of the quality control probes is not present or is not known
to be present in any naturally occurring nucleic acid or is not known to encode any naturally
occurring protein, respectively. In another embodiment, the predetermined binding
sequence of the quality control probes is not present or is not known to be present in the
sample. This is done to reduce the likelihood that the predetermined binding sequence will
be cross-reactive. Cross-reactivity indicates that a biopolymer has the ability to interact
(e.g., hybridize or bind) with more than one other biopolymer present during microarray
processing. For example, during processing of an oligonucleotide microarray, if the
predetermined binding sequence hybridizes with its complementary nucleic acid as well as
with a different sequence in the biological sample then the probe is said to be cross-reactive.
Cross-reactivity in a probe is undesirable, since it could alter the signal intensities observed
from sample processing and affect the assessment of microarray synthesis quality.

In one embodiment, the potential sequence of monomers that make up the
predetermined binding sequence can be identified from a pool of randomly synthesized
sequences. These potential predetermined binding sequences can then be assayed for their
cross-reactivity with the biological sample to be processed or probes designed to detect
naturally occurring sequences in the biological sample during processing. Preferably,
predetermined binding sequences that are not substantially cross-reactive with biopolymers
being assayed in the sample are used in quality control probes. Thus, at the time of
microarray synthesis, the sequence of the quality control probes is known although the
sequence is random in that it had initially been the product of a random synthesis. The
random sequences are biopolymer residues (e.g., nucleotide or amino acid residues) that are
generated without a preplanned specific design as to the actual resulting sequence, i.e.,
when a monomer (e.g., nucleotide, amino acid) is said to be random it is unpredictable what
monomer will occur at that residue. The random sequences can be synthesized by an
unbiased synthesis scheme wherein each possible residue has an equal chance of being
incorporated into the biopolymer at each position. Alternatively, the random sequences can
be synthesized by a biased synthesis scheme wherein certain positions in the biopolymer
have an increased chance of having one residue over another. Additionally, a combination
of unbiased and biased synthesis methods can be used to synthesize any one biopolymer. In
one embodiment, sequences on either end or at internal positions may be added to the
predetermined binding sequence for the purposes of facilitating standard molecular
biological manipulations. Once generated, the sequence of the predetermined binding
sequence, if generated randomly, is determined. Preferably, the sequence is then tested for
cross-reactivity, and recorded for future use. For each microarray one or more of the
predetermined binding sequences that have been empirically determined to be
non-cross-reactive are then synthesized on the microarray to allow for future assessment of
synthesis quality or other non-synthesis defects in the array.
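
By way of illustration only, the unbiased and biased random synthesis schemes described
above might be simulated in software as sketched below. The function name
`random_sequence`, the `position_weights` parameter, and the restriction to the four DNA
nucleotides are assumptions made for this sketch and are not part of the described method.
Candidate sequences produced this way would still need to be tested empirically for
cross-reactivity.

```python
import random

NUCLEOTIDES = "ACGT"  # assumption: a DNA oligonucleotide microarray


def random_sequence(length, rng, position_weights=None):
    """Draw a candidate predetermined binding sequence.

    position_weights (hypothetical parameter) maps a 0-based position to four
    weights, one per nucleotide in NUCLEOTIDES. Positions without an entry are
    drawn uniformly (unbiased); positions with an entry are drawn with the
    given bias. Mixing both in one sequence gives a combined scheme.
    """
    residues = []
    for position in range(length):
        weights = None
        if position_weights is not None:
            weights = position_weights.get(position)
        # random.choices draws uniformly when weights is None.
        residues.append(rng.choices(NUCLEOTIDES, weights=weights, k=1)[0])
    return "".join(residues)


rng = random.Random(2002)  # fixed seed so the draw is reproducible
unbiased = random_sequence(25, rng)
# Bias position 0 so that it is always "A"; all other positions stay unbiased.
biased = random_sequence(25, rng, position_weights={0: [1, 0, 0, 0]})
```

Passing an explicit `random.Random` instance keeps the draw reproducible, which matters
here because the resulting sequence must be recorded for future use.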

In another embodiment, the predetermined binding sequence can be a naturally
occurring sequence that is not endogenous to the sample that is to be processed on the
microarray. For example, if the sample is from a eukaryotic source, then a bacterial
sequence (or fragment thereof) can be used as the predetermined binding sequence.
Cross-reactivity could be assessed as a precautionary measure.

Accordingly, where the binding partner to the predetermined sequence of the quality
control probes is not endogenously present in the sample to be assayed for binding to the
microarray, the binding partner to the predetermined binding sequence in the quality control
probe is introduced into the sample at any time prior to or during contacting of the sample
with the microarray. In one embodiment, the binding partner is added to the sample during
sample processing. In a more preferred embodiment, the binding partner is added to the
sample immediately prior to contact of the sample with the microarray.

The predetermined binding sequence can be made of any type of biological
macromolecule; preferably the molecular nature of the quality control probe is consistent
with that of the other biopolymer probes on the microarray. For example, the
predetermined binding sequence can be composed of nucleotides (e.g., DNA or RNA),
amino acids, glycans, saccharides, or small organic molecules.

In one embodiment, the predetermined binding sequence is a nucleic acid,
preferably an oligonucleotide, and a nucleic acid microarray is contacted with a sample
comprising a nucleic acid comprising a sequence complementary to the predetermined
binding sequence under conditions conducive to hybridization, and the amount of
hybridization to quality control probes is compared.

In another embodiment, the predetermined binding sequence is a protein
(polypeptide or peptide), and a protein microarray is contacted with a sample comprising a
binding partner to said protein under conditions conducive to binding, and the amount of
binding to quality control probes is compared. In one embodiment, the binding moiety is an
epitope recognized by an antibody, preferably a monoclonal antibody. Preferably, epitopes
are unique (i.e., not endogenously expressed in cells or tissues that provide protein material
for the samples) to minimize cross-reactivity of the antibodies directed to predetermined
binding sequence epitopes with sample epitopes during detection.

The length of the predetermined binding sequence can vary depending upon the
length of the other biopolymer probes on the microarray used to detect binding partners in
the sample to be assessed. Typically, the predetermined binding sequence is composed of a
smaller number of monomers than the other biopolymer probes on the microarray. This
allows the predetermined binding sequence to represent only a subset of the total monomers
that make up the other biopolymer probes on the microarray. As such, multiple
predetermined binding sequences are needed to represent each full length biopolymer
probe. This allows for different cycles of synthesis to be targeted for inspection by different
quality control probes depending upon which cycles of synthesis were used to synthesize
the predetermined binding sequence. Binding intensities can be compared between
different predetermined binding sequences to ascertain information regarding the different
portions of the full length biopolymer probes. The predetermined binding sequence is
preferably between 5-95%, 10-75%, 25-65%, 35-60%, 40-55%, or 41-48% of the length of
the other biopolymer probes on the microarray. In another embodiment, the predetermined
binding sequence is 15 biopolymer residues when the other probes on the microarray are 60
biopolymer residues in length. In another embodiment, the predetermined binding sequence
is 25 biopolymer residues when the other probes on the microarray are 60 biopolymer
residues in length. The length of the biopolymer probes on the microarray that are not
quality control probes, when nucleic acids, is preferably in the range of 10-500 nucleotides,
more preferably 10-250, 20-100, 40-80, 50-70 or 60 nucleotides.

5.1.1.1 PREDETERMINED BINDING SEQUENCES
WITH INTENTIONAL DELETIONS

In some embodiments, the predetermined binding sequence has an intentional
deletion of one or more monomers relative to a sequence that binds a binding partner used
to detect the quality control probe during microarray processing. Thus, in a specific
embodiment, the predetermined binding sequence has an internal deletion of a nucleotide
relative to a sequence perfectly complementary to the nucleic acid used to detect the quality
control probe by hybridization. Although this does decrease the signal intensity due to an
imperfect binding pair, signal can still be observed. Any additional deletions due to a
failure during microarray synthesis would exacerbate the difference between predetermined
binding sequence and binding partner and thus serve to drastically reduce the signal
observed during microarray processing. In one embodiment, on an oligo microarray, the
predetermined binding sequence is a 24mer (i.e., has one monomer intentionally deleted)
and the binding partner is a 25mer.

In one embodiment, each quality control probe on a microarray comprises a
predetermined binding sequence comprising one or more such intentional deletions. In
another embodiment, the quality control probes on a microarray are a mixture of those
comprising predetermined binding sequences comprising one or more intentional deletions
and those comprising predetermined binding sequences with no intentional deletions.

5.1.2 SPACERS

In some embodiments, the quality control probes comprise a chemical structure
contiguous with the predetermined binding sequence. This chemical structure is referred to
herein as a spacer. The spacer is preferably made up of 0 to N monomers (e.g., nucleotides,
amino acid residues), where N is a whole number integer equal to or greater than 1.
Preferably, the spacers added are less than 75%, less than 50%, less than 25%, less than
20%, less than 15%, less than 10%, less than 5%, or less than 1% of the total sequence of
the quality control probe. Spacers can be on one side of the predetermined binding
sequence or on both sides. For nucleic acid probes, the spacers can be either 5' or 3' or both
5' and 3' to the predetermined binding sequence. In one embodiment, the spacers are
exclusively 3' to the predetermined binding sequence. For protein probes, the spacers can
be either amino- or carboxy-terminal or both amino- and carboxy-terminal to the
predetermined binding sequence. In a specific embodiment, the spacer is a nucleotide or
protein sequence.

In one embodiment, the value of the upper limit of N is determined by the length of
the biopolymer probes synthesized on the microarray that are not quality control probes
(i.e., those not containing the predetermined binding sequence). The total length of the
quality control probe is preferably not greater than the total length of the other biopolymer
probes on the microarray. Therefore, in a specific embodiment, N plus the number of
monomers in the predetermined binding sequence should equal the total number of
monomers in the biopolymer probes on the array that are not quality control probes.
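
The relationship in the specific embodiment above (N plus the length of the predetermined
binding sequence equals the length of the other probes) can be sketched as follows. This is
an illustrative sketch only; the function names are hypothetical, and the lengths used in
the example (60 and 25 residues) are taken from the embodiments described in Section 5.1.1.

```python
def spacer_length(test_probe_length, binding_sequence_length):
    """Return N, the spacer length, for the embodiment in which N plus the
    binding sequence length equals the length of the non-quality-control probes."""
    if binding_sequence_length > test_probe_length:
        raise ValueError("binding sequence is longer than the test probes")
    return test_probe_length - binding_sequence_length


def spacer_fraction(test_probe_length, binding_sequence_length):
    """Fraction of the quality control probe that is made up of spacer."""
    n = spacer_length(test_probe_length, binding_sequence_length)
    return n / test_probe_length


# A 25-residue binding sequence on an array of 60-residue probes.
n_monomers = spacer_length(60, 25)
fraction = spacer_fraction(60, 25)
```

The fraction can then be compared against whichever of the preferred spacer percentages
listed above is being applied.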

In another embodiment, the value of N is not constrained by the length of the
biopolymer probes synthesized on the microarray that are not quality control probes (i.e.,
those not containing the predetermined binding sequence). In this embodiment, quality
control probes can be shorter or longer than the other biopolymer probes on the microarray.

Spacers are preferably not cross-reactive with the biopolymer being assayed in the
sample. During microarray processing, preferably no signal is detected from the spacer.
Additionally, the spacer should not interfere with the signal generated from the
predetermined binding sequence binding to its binding partner during microarray
processing. In one embodiment, interference with such signal is prevented because the
chemical structure that makes up the spacer is modified such that the spacer is not able to
bind a binding partner. For example, modified nucleic acids that are not competent to
hybridize can be used in spacers and will be non-cross-reactive, e.g., abasic nucleotides
(i.e., moieties lacking a nucleotide base, but having the sugar and phosphate portions) (see
generally U.S. Patent 6,248,878; Takeshita et al., 1987, J. Biol. Chem. 262:10171; abasic
nucleotides are commercially available from Glen Research in Sterling, Virginia). In
another embodiment, spacers can be made of a chemical moiety that is different from the
monomers present in the other biopolymer probes on the microarray not dedicated to quality
control and/or the monomers that make up the predetermined binding sequence. For
example, on a nucleotide microarray, spacers can be made from non-nucleotide moieties
such as polyether, polyamine, polyamide, or polyhydrocarbon compounds. Specific
examples include those described by Seela and Kaiser, 1990, Nucleic Acids Res. 18:6353;
Seela and Kaiser, 1987, Nucleic Acids Res. 15:3113; Cload and Schepartz, 1991, J.
Am. Chem. Soc. 113:6324; Richardson and Schepartz, 1991, J. Am. Chem. Soc. 113:5109;
Ma et al., 1993, Nucleic Acids Res. 21:2585; Ma et al., 1993, Biochemistry 32:1751; Durand
et al., 1990, Nucleic Acids Res. 18:6353; McCurdy et al., 1991, Nucleosides & Nucleotides
10:287; Jaschke et al., 1993, Tetrahedron Lett. 34:301; Ono et al., 1991, Biochemistry
30:9914; Ferentz and Verdine, 1991, J. Am. Chem. Soc. 113:4000; U.S. Patent 6,362,323;
International Publication No. WO 89/02439.

Preferably, once generated, the entire quality control probe sequence is determined,
tested for cross-reactivity, and recorded for future use.

5.2 QUALITY CONTROL PROBES WITHOUT 
PREDETERMINED BINDING SEQUENCES 

In some embodiments quality control probes do not have predetermined binding
sequences but are made exclusively of a spacer. Signals observed with this type of quality
control probes are emitted either 1) directly from the chemical structure (e.g., the
monomers) that make up the quality control probe or 2) indirectly through the use of a
labeled dye which interacts with the chemical structure (e.g., the monomers) that make up
the quality control probe. These types of quality control probes can give off a signal
without the use of a labeled binding partner.

Thus, in one embodiment, quality control probes are synthesized with labeled
monomers. The labeled monomers can be, for example, fluorescently labeled (e.g., Cy3,
Cy5) nucleotides or fluorescently labeled amino acids. Other labels that can be used
include, but are not limited to, electron rich molecules and radioactive isotopes. Each
quality control probe incorporates one or more labeled monomers during synthesis.

In a specific embodiment, the synthesis cycle in which the labeled monomer is
incorporated into the quality control probe is varied with each quality control probe. Each
cycle of synthesis is represented by at least one, but preferably more than one, quality
control probe having a label in the monomer deposited in that synthesis cycle. Should a
synthesis defect occur, no labeled monomer is incorporated and the deficiency can be
detected. In a preferred aspect, each quality control probe is the same length.
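
As a non-limiting sketch, the assignment of labeled monomers to synthesis cycles, and the
detection of a cycle in which no label was incorporated, might be expressed as follows. The
function names, the `replicates` and `background` parameters, and the numeric signal values
are assumptions introduced for illustration only.

```python
def label_cycle_design(n_cycles, replicates=2):
    """Return one entry per quality control probe giving the synthesis cycle
    (1-based) in which that probe receives its labeled monomer. Every cycle
    is represented by `replicates` probes."""
    return [cycle for cycle in range(1, n_cycles + 1) for _ in range(replicates)]


def cycles_missing_label(label_cycles, signals, background):
    """Return the cycles for which every probe labeled in that cycle shows a
    signal at or below the (assumed) background level."""
    by_cycle = {}
    for cycle, signal in zip(label_cycles, signals):
        by_cycle.setdefault(cycle, []).append(signal)
    return sorted(
        cycle
        for cycle, values in by_cycle.items()
        if all(value <= background for value in values)
    )


design = label_cycle_design(n_cycles=5, replicates=2)
# Hypothetical signals: both probes labeled in cycle 3 are dark.
signals = [900, 950, 920, 910, 40, 35, 930, 905, 915, 925]
failed = cycles_missing_label(design, signals, background=50)
```

Requiring all replicates of a cycle to be dark is one possible rule; it reflects the
preference above for more than one probe per cycle, so that a single damaged feature is not
mistaken for a failed cycle.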

In another specific embodiment, the quality control probe is made of the same
number of monomers that make up the test probes on the microarray (i.e., those probes on
the microarray that are not quality control probes) with one of the monomers being labeled.

In another specific embodiment, the quality control probes are of varying lengths such
that there is at least one, but preferably more than one, quality control probe that terminates
at each cycle of synthesis. In such quality control probes, the last monomer of each of the
quality control probes is a labeled monomer.

In another embodiment, quality control probes are synthesized with no
predetermined binding sequence using unlabeled monomers. The signal generated relies on
the monomers' intrinsic ability to generate a signal, e.g., to fluoresce. Nucleic acid quality
control probes of varying lengths can be synthesized and the microarray can be scanned
prior to processing by hybridization to labeled probes. The degree of fluorescence observed
should correlate with the length of the quality control probes due to an increased number of
monomers (nucleotides) in longer probes.

In another embodiment, a labeled dye that directly binds to the monomers of the
quality control probes can be used to generate a detectable signal. For example, for a
nucleic acid microarray, various fluorescent nucleic acid stains can be used such as POPO,
SYBR Green I, SYBR Green II, SYTO 59, and SYTO 61 (available from Molecular
Probes, Inc. in Eugene, OR). After assessing the microarray synthesis efficiency, the dyes
can be removed prior to incubation of the microarray with test samples.

5.3 QUALITY CONTROL PROBE SYNTHESIS ON MICROARRAYS

During a step-by-step biopolymer probe synthesis onto the microarray substrate,
there can be faulty monomer addition at one or more cycles of synthesis at one or
more areas on the microarray. To discern if such a synthesis error occurred, quality control
probes of the invention are synthesized at different places on the microarray and the signals
of the different quality control probes are compared. Significant signal deviation from what
is expected indicates a synthesis defect (see Section 5.4).

5.3.1 VERTICAL PLACEMENT 

In one embodiment, quality control probes that generate a detectable signal either by
binding to a predetermined binding sequence or by incorporation of labeled monomers can
be displaced from each other vertically to assess the efficiency of all cycles of synthesis. In
one embodiment, synthesis of the predetermined binding sequences is initiated during the
step-by-step monomer addition at different cycles of synthesis. Therefore, although each
predetermined binding sequence for a group of quality control probes is identical, the cycles
of synthesis creating the predetermined binding sequence on the microarray are displaced
from each other in a vertical fashion. In another embodiment, the cycle of synthesis in
which the labeled monomer is incorporated into the quality control probe is varied such that
each cycle of synthesis should have incorporated a labeled monomer in at least one quality
control probe. These methods can be used to pinpoint the cycle of synthesis that was
affected by faulty monomer addition.

In one embodiment, this vertical displacement is accomplished through the use of
spacers. For example, by varying the number of monomers in a spacer, the synthesis cycle
of the microarray at which synthesis of the predetermined binding sequence begins will
also vary. Consequently, this makes each predetermined binding sequence vulnerable to
defects in monomer addition occurring at different cycles in the synthesis. Should there be
no synthesis defects during microarray synthesis, then the binding partner of the
predetermined binding sequence should bind equally well (i.e., similarly) to the
predetermined binding sequence on all of the quality control probes. In determining if the
binding partner of the predetermined binding sequence on the different quality control
probes is binding similarly, it must be appreciated that, when the quality control probes
comprise spacers, differences in binding may be due in part to the distance the
predetermined binding sequence is from the microarray (see Section 5.4). The binding
differences expected due to the different spacer lengths are thus preferably ignored
when determining whether the different quality control probes are binding "similarly".

In a specific embodiment, the quality control probes of a group all comprise
identical predetermined binding sequences but differ in the overall number of monomers in
the quality control probe due to a varying number of monomers comprising the spacers.

In another embodiment, this vertical displacement is accomplished by varying the
synthesis cycle of the microarray at which the labeled monomer is incorporated into the
quality control probe. Consequently, this makes each labeled monomer addition vulnerable
to defects in monomer addition occurring at different cycles in the synthesis. Should there
be no synthesis defects during microarray synthesis, then each quality control probe should
have incorporated an equal number of labeled monomers and thus will give comparable
signals.

In another embodiment, this vertical displacement is accomplished with a staggered
start synthesis. As above, each predetermined binding sequence is displaced in its start of
synthesis with respect to each other by one or more sequential cycles of monomer addition.
In one embodiment, spacers are used to accomplish this displacement. In a more preferred
embodiment, spacers are not used to accomplish this displacement. Rather, monomer
addition is delayed at the position on the microarray to be occupied by the predetermined
binding sequence until microarray synthesis has reached the cycle at which synthesis of the
predetermined binding sequence is to be initiated. In this embodiment, all quality control
probes comprise the same number of monomers but the synthesis using these monomers at
different positions on the microarray (corresponding to the quality control probes) is
separated temporally.
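
The way vertically displaced quality control probes can pinpoint an affected cycle may be
illustrated with the following sketch. It assumes, for illustration only, that one monomer
is added per cycle, that cycles are numbered from 1, that a probe whose start is displaced
by k cycles has its predetermined binding sequence built in cycles k+1 through k+B (B being
the length of the binding sequence), that a single cycle was defective, and that each probe
has already been classified as low-signal or normal. The function names are hypothetical.

```python
def binding_window(displacement, binding_length):
    """Cycles (1-based) in which the predetermined binding sequence is built
    when its start is displaced by `displacement` cycles."""
    start = displacement + 1
    return set(range(start, start + binding_length))


def candidate_defect_cycles(displacements, binding_length, low_signal):
    """Return the cycles consistent with the observed pattern: cycles shared
    by the windows of all low-signal probes and absent from the windows of
    all normal probes. Assumes a single defective cycle."""
    affected = None
    unaffected = set()
    for displacement, is_low in zip(displacements, low_signal):
        window = binding_window(displacement, binding_length)
        if is_low:
            affected = window if affected is None else affected & window
        else:
            unaffected |= window
    if affected is None:
        return set()
    return affected - unaffected


# Eight probes, 25-residue binding sequence, starts displaced in steps of 5 cycles.
displacements = [0, 5, 10, 15, 20, 25, 30, 35]
# Hypothetical observation: the probes displaced by 5 to 25 cycles show reduced signal.
low = [False, True, True, True, True, True, False, False]
candidates = candidate_defect_cycles(displacements, 25, low)
```

In this example the defect is narrowed to cycles 26 through 30. A smaller step between
displacements narrows the candidate set further, at the cost of more quality control probes.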

5.3.2 HORIZONTAL PLACEMENT 

The quality control probes of the invention can be synthesized on the microarray
substrate in a number of different locations to make up a number of different patterns.
These patterns can be used to identify areas of microarray synthesis defects as well as to
impart positional information to the microarray during processing. The number of quality
control probes on a microarray should be sufficient to adequately represent the synthesis
across the entire microarray. For example, the number of probes on the microarray that are
quality control probes should be about 0.5% or more, 1% or more, 2% or more, 3% or
more, 5% or more, 10% or more, 20% or more, of the total probes on the microarray.

In one embodiment, one or more rows of quality control probes (called gridlines)
can be synthesized at different positions throughout the microarray. Each section of the
microarray can contain a gridline to ensure that all sections have been assessed for
competent synthesis. In one embodiment, the integrity of biopolymer probe synthesis at the
edge of the microarray can be monitored through the use of an outer (or perimeter) gridline,
e.g., of 1-5 adjacent borders of quality control probes (FIG. 17A). Sections of the
microarray near or at the edge can be dedicated to quality control probes such that any
defect can be detected should it be present. In another embodiment, the integrity of
biopolymer probe synthesis in the center of the microarray can be monitored through the
use of a diagonal gridline (FIG. 17B). Quality control probes can be synthesized in
positions that traverse the array diagonally thus traversing representative sections of the
microarray. In a preferred embodiment, gridline patterns are made up of quality control
probes containing spacers.

In another embodiment, clusters of quality control probes can be synthesized in
sections of the microarray to assess synthesis quality. In one embodiment, quality control
probes are synthesized in randomized positions throughout the middle of the array (FIG.
17C). In another embodiment, quality control probes can be synthesized at the corners of
the microarray (FIG. 17D).
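
For illustration, the perimeter and diagonal gridline placements described above might be
generated as sets of (row, column) positions, and the resulting share of quality control
probes checked against the percentages given earlier in this section. The rectangular
row/column layout, the function names, and the array dimensions are assumptions of this
sketch and are not taken from FIGS. 17A-17D.

```python
def perimeter_gridline(n_rows, n_cols, width=1):
    """Positions within `width` features of any edge of the array."""
    return {
        (row, col)
        for row in range(n_rows)
        for col in range(n_cols)
        if min(row, col, n_rows - 1 - row, n_cols - 1 - col) < width
    }


def diagonal_gridline(n_rows, n_cols):
    """One position per row, running from the top-left to the bottom-right corner."""
    span = max(n_rows - 1, 1)  # avoid division by zero for a single-row array
    return {(row, row * (n_cols - 1) // span) for row in range(n_rows)}


def quality_control_fraction(positions, n_rows, n_cols):
    """Share of all features on the array that are quality control probes."""
    return len(positions) / (n_rows * n_cols)


border = perimeter_gridline(10, 10, width=1)
diagonal = diagonal_gridline(4, 4)
share = quality_control_fraction(border, 10, 10)
```

Representing each pattern as a set makes it straightforward to combine patterns by set
union without counting shared positions twice.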

In another embodiment, when the microarrays are synthesized by ink jet technology,
the quality control probes can be arranged on the microarray such that failures of particular
nozzle(s) can be detected. A reduction in signal intensity in quality control probes that have
a periodicity consistent with being printed by a particular nozzle can signify that that nozzle
has been repeatedly defective. When there are N nozzles in the ink jet head, a reduction in
quality control probe intensity with a periodicity of N signifies a clogged or defective
nozzle (wherein N is a whole number of 1 or greater). In one embodiment, N is 20. In a
further embodiment, the diagonal gridline (FIG. 17B) is used to assess nozzle clogs or
defects.
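
A reduction in intensity with a periodicity of N might be detected as sketched below. The
sketch assumes, for illustration only, that quality control probes are listed in printing
order and that the probe at index i was printed by nozzle i modulo N. The function name and
the `drop_fraction` threshold are hypothetical.

```python
from statistics import mean, median


def suspect_nozzles(intensities, n_nozzles, drop_fraction=0.5):
    """Return nozzle indices whose mean quality control probe intensity is
    below `drop_fraction` times the median of the per-nozzle means."""
    per_nozzle = {}
    for index, value in enumerate(intensities):
        # Assumption: probe `index` was printed by nozzle `index % n_nozzles`.
        per_nozzle.setdefault(index % n_nozzles, []).append(value)
    nozzle_means = {nozzle: mean(values) for nozzle, values in per_nozzle.items()}
    reference = median(nozzle_means.values())
    return sorted(
        nozzle
        for nozzle, nozzle_mean in nozzle_means.items()
        if nozzle_mean < drop_fraction * reference
    )


# Hypothetical data for N = 20: every probe printed by nozzle 7 is dim.
intensities = [200 if i % 20 == 7 else 1000 for i in range(100)]
clogged = suspect_nozzles(intensities, n_nozzles=20)
```

Using the median of the per-nozzle means as the reference keeps one defective nozzle from
dragging down the baseline against which it is judged.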

In another embodiment, quality control probe patterns can be used to impart
positional information about the microarray. Because the sites at which the quality control
probes are synthesized during microarray synthesis are known, probes can be used to align
the microarray during processing.

5.4 DETECTION OF DEFECTS ON A MICROARRAY

All of the quality control probes that comprise the same predetermined binding
sequence should bind to the binding partner similarly. However, in many instances, the
inventors have found that spacers that increase the distance of the predetermined binding
sequence from the microarray actually increase the signal intensity upon binding of a given
predetermined binding sequence when compared to the signal observed from an identical
predetermined binding sequence attached directly to the microarray. Without being bound
by a particular mechanism, the increased signal intensity may result from the predetermined
binding sequence being more accessible to its binding partner by virtue of its being further
away from the microarray (e.g., by having spacers directly attached to the microarray
comprising an increasing number of monomers contiguous with the predetermined binding
sequence). A deviation in the amount of binding between different quality control probes
and the binding partner that is greater than that expected due to differing distance of the
predetermined binding sequence from the microarray may indicate a problem in microarray
quality. Defects in microarray quality may be global (i.e., the defect affects the entire
microarray) or localized (i.e., the defect affects one or more areas of the microarray and
does not affect other areas).

In a specific embodiment, use of the quality control probe of the invention allows
detection of microarray synthesis defects (e.g., a flow cell gradient where bubbles or other
problems in the flow cell lead to non-uniform reagent coverage of the microarray during
some of the synthesis cycles). However, other types of defects affecting microarray quality
can also be detected by use of the quality control probes of the invention. Defects in the
microarray can be due to occurrences other than synthesis defects. Quality control probes
can be used to detect these types of defects as well. In one embodiment, microarray defects
detectable by the methods of the invention occur during storage of the microarray.
Suboptimal conditions (e.g., improper temperature or moisture level) can cause microarray
quality to deteriorate. Other defects that are detectable by the methods of the invention
include but are not limited to an abrasion that causes a localized defect on a microarray.
Such an abrasion can occur during storage or processing of the microarray. A defect can
occur during processing of the microarray. Such a defect can cause a non-uniformity of
signal that can be detected by comparing signal intensities across the microarray.
Comparison of binding intensities can be accomplished in a number of ways.

In one embodiment, a binding ratio for each set of quality control probes can be
calculated. For quality control probes comprising predetermined binding sequences and not
comprising spacers, signals generated during microarray processing for a particular quality
control probe should equal signals generated for another different quality control probe. A
ratio of the two signals should approach 1. Deviation from 1 indicates that one of the two
quality control probes used in the calculation had decreased binding to its binding partner.
Such would be the case if a synthesis defect caused the predetermined binding sequence in
the quality control probe to be defective and thus unable to bind its binding partner at
normal levels. For quality control probes comprising both predetermined binding sequence
and spacers, signals generated during microarray processing for a particular quality control
probe may or may not equal signals generated for another different quality control probe
due to the differences in distance from the microarray. A ratio of the two signals from
predetermined binding sequences that are a similar distance from the microarray (e.g., the
synthesis of each predetermined binding sequence was initiated within 3 cycles of synthesis
from each other) should approach 1. However, a ratio of the two signals from
predetermined binding sequences that are different distances from the microarray (e.g., the
synthesis of each predetermined binding sequence was initiated greater than 3 cycles of
synthesis from each other) could deviate from 1. In this instance, the ratio expected can be
determined using data from microarrays known to have no defects. Such microarrays can
be identified, e.g., by making a plurality of arrays (preferably at least 5) and comparing the
results to identify ones with no defects. Deviation from this determined expected ratio can
then be used to detect defects in microarrays.
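
The binding ratio comparison described above can be sketched as follows. This is an
illustrative sketch only: the function names, the use of a simple mean over reference
arrays, the relative `tolerance`, and all numeric signal values are assumptions and are not
prescribed by the method.

```python
from statistics import mean


def binding_ratio(signal_a, signal_b):
    """Ratio of the signals of two quality control probes."""
    return signal_a / signal_b


def expected_ratio(reference_pairs):
    """Mean binding ratio over (signal_a, signal_b) pairs measured on
    microarrays known to have no defects."""
    return mean(binding_ratio(a, b) for a, b in reference_pairs)


def deviates(observed, expected, tolerance=0.25):
    """True if the observed ratio differs from the expected ratio by more
    than `tolerance`, expressed as a fraction of the expected ratio."""
    return abs(observed / expected - 1.0) > tolerance


# Hypothetical reference data from 5 defect-free arrays: the probe further from
# the surface (signal_a) is brighter than the probe closer to it (signal_b).
reference = [(1500, 1000)] * 5
baseline = expected_ratio(reference)

ok = deviates(binding_ratio(1450, 1000), baseline)
defective = deviates(binding_ratio(600, 1000), baseline)
```

For probes without spacers, or probes whose binding sequences were initiated within a few
cycles of each other, the same check applies with an expected ratio of 1.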

For each type of microarray (e.g., oligonucleotide, protein, etc.), the range of
binding ratio values that indicates that there is no defect can be determined empirically. For
example, various predetermined binding sequences known to be without defect can be
bound to their binding partner and signals recorded. These signals can serve as the baseline
values used to determine the expected binding ratios. By varying the horizontal and vertical
placement of the quality control probes on the microarray, a range of acceptable ratios can
be determined. Deviation from these empirically determined ratios indicates a defective
microarray. In a specific embodiment, when the microarray is an oligonucleotide
microarray, a binding ratio of between 0.25 and 2.25, 0.5 and 2.0, or 0.75 and 1.25 indicates
no synthesis defect.

In a specific embodiment, for microarrays using quality control probes that are a
mixture of those comprising predetermined binding sequences comprising one or more
intentional deletions relative to a sequence that binds a binding partner used to detect the
quality control probe during microarray processing and those comprising predetermined
binding sequences with no intentional deletion, binding ratios can be calculated and used to
assess microarray quality. Signals generated by binding of a labeled binding partner to each
type of predetermined binding sequence (i.e., either with or without intentional deletions)
will necessarily be different. Binding intensities determined from microarrays known to
have no defects can be used to calculate expected binding ratios. Such microarrays can be
identified, e.g., by making a plurality of arrays (preferably at least 5) and comparing the
results to identify ones with no defects. Deviation from the expected ratio indicates a
defect.

In another embodiment, comparison of binding intensities can be accomplished
through a statistical analysis. The mean binding intensity for a group of quality control
probes can be calculated by averaging the value of the signal (e.g., fluorescence) observed
for each. The amount of signal observed for each individual quality control probe can then
be compared to the mean of the group. In one embodiment, those quality control probes
that are within two standard deviations from the mean indicate that there is no quality defect
in the microarray, e.g., that there was no defect during their synthesis, or incurred during
processing, storage, or otherwise. In a more preferred embodiment, those quality control
probes that are within one standard deviation from the mean indicate that there is no defect.
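
A minimal sketch of this statistical comparison is given below. The function name and the
signal values are hypothetical, and the use of the sample standard deviation (rather than
the population standard deviation) is an assumption, since the text does not specify which
is intended.

```python
from statistics import mean, stdev


def probes_outside(signals, n_sd=2.0):
    """Return the indices of quality control probes whose signal lies more
    than `n_sd` standard deviations from the mean of the group."""
    group_mean = mean(signals)
    group_sd = stdev(signals)  # assumption: sample standard deviation
    return [
        index
        for index, signal in enumerate(signals)
        if abs(signal - group_mean) > n_sd * group_sd
    ]


# Hypothetical fluorescence values for a group of ten quality control probes.
signals = [1000, 1020, 980, 1010, 990, 1005, 995, 1015, 985, 300]
flagged_two_sd = probes_outside(signals, n_sd=2.0)
flagged_one_sd = probes_outside(signals, n_sd=1.0)
```

Because an outlying probe is itself included in the mean and standard deviation, a strongly
defective probe widens the acceptance band; the one standard deviation criterion of the
more preferred embodiment is correspondingly stricter.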

In another embodiment, more than one fluorescent dye can be used to label the
binding partner which binds to the predetermined binding sequence. For example, a subset
of the binding partners can be labeled with Cy3 and a subset can be labeled with Cy5. A
ratio of signal detected from a single quality control probe for each type of fluor used can be
determined. By varying the horizontal and vertical placement of the quality control probes
on the microarray, a range of acceptable ratios can be determined. Deviation from this
empirically determined ratio indicates a microarray defect.
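
One way to picture the empirically determined range is sketched below. The (Cy3, Cy5) readings, the function names, and the 10% margin added around the observed minimum and maximum are all assumptions made for illustration.

```python
def acceptable_range(reference_pairs, margin=0.1):
    """Derive a range of acceptable Cy5/Cy3 ratios from quality control
    probes placed at different positions on defect-free arrays.
    The margin widening the observed range is an assumed value."""
    ratios = [cy5 / cy3 for cy3, cy5 in reference_pairs]
    return min(ratios) * (1 - margin), max(ratios) * (1 + margin)

def within_range(cy3, cy5, bounds):
    """True when the probe's Cy5/Cy3 ratio falls inside the acceptable range."""
    low, high = bounds
    return low <= cy5 / cy3 <= high

# Hypothetical (Cy3, Cy5) signals from probes at three array positions.
reference_pairs = [(1000.0, 950.0), (1100.0, 1100.0), (900.0, 945.0)]
bounds = acceptable_range(reference_pairs)

probe_ok = within_range(1000.0, 1000.0, bounds)
probe_defective = within_range(1000.0, 400.0, bounds)
```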

For microarrays using quality control probes without predetermined binding
sequences and synthesized with labeled monomers, similar methods can be used to detect
defects. Instead of the signal originating from the labeled binding partner of the
predetermined binding sequence, it will come from the quality control probe itself that is
attached to the microarray. Ratios and standard deviations from the mean signal can be
used to assess integrity of the microarray.

For microarrays using quality control probes without predetermined binding
sequences or labeled monomers, similar methods can be used to detect quality defects. In
these microarrays, however, the detectable signal is proportional to the length of the quality
control probe; thus, signal intensities should not be similar for quality control probes of
differing lengths. Rather, a more intense signal is expected from longer quality control
probes. Deviation from the differences expected to be seen between probes indicates a
defect in the microarray.
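
The length-dependent expectation can be illustrated with a simple proportional model. The sketch assumes strict proportionality (a line through the origin) fitted by least squares on a defect-free array; the probe lengths, signal values, and 20% tolerance are hypothetical.

```python
def fit_slope(lengths, signals):
    """Least-squares slope for the model signal = slope * length."""
    numerator = sum(length * signal for length, signal in zip(lengths, signals))
    denominator = sum(length * length for length in lengths)
    return numerator / denominator

def deviating_lengths(lengths, signals, slope, tolerance=0.2):
    """Return probe lengths whose signal differs from the signal expected
    for that length by more than the relative tolerance (assumed value)."""
    return [length for length, signal in zip(lengths, signals)
            if abs(signal - slope * length) > tolerance * slope * length]

lengths = [10, 20, 40, 60]                      # probe lengths in monomers
reference_signals = [100.0, 200.0, 400.0, 600.0]  # hypothetical defect-free array
slope = fit_slope(lengths, reference_signals)

test_signals = [105.0, 190.0, 410.0, 300.0]     # hypothetical array under test
flagged = deviating_lengths(lengths, test_signals, slope)
```

Here the 60-monomer probe gives roughly half the expected signal, which would be consistent with a failure in the later synthesis cycles.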

In one embodiment, when mixtures of quality control probes are used, expected 
binding ratios or signal intensities can be determined empirically. Microarrays that are
known to contain no defects can be used to get baseline values for predetermined binding 
sequence binding to its binding partner or signal intensities for each of the different types of 
quality control probes. Ratios can be determined from this data and used as the expected 
ratios. Deviation from these ratios indicates a defective microarray. 

5.5 MICROARRAY SYNTHESIS AND PROCESSING

The probes on microarrays can be any one of a number of different biopolymers,
e.g., DNAs, RNAs, peptide nucleic acids (PNAs) (see e.g., Egholm et al., 1993, Nature
363:566-568; U.S. Patent No. 5,539,083), or proteins. The microarrays of the invention are
synthesized by a step-by-step addition of monomers onto a solid support. Each such
monomer is a unit of biopolymer that is added during one synthesis cycle. In one
embodiment, the unit of biopolymer added per synthesis cycle is itself composed of not
more than one basic biopolymer unit (e.g., a nucleotide, amino acid, etc.). In another
embodiment, the unit of biopolymer added per synthesis cycle consists of more than one
basic biopolymer unit (e.g., a dinucleotide, a dipeptide, a nucleotide or amino acid
covalently linked to another moiety, etc.). In another embodiment, the unit of biopolymer
added per synthesis cycle varies with different synthesis cycles.

5.5.1 NUCLEOTIDE MICROARRAYS 

In a preferred embodiment in the present invention, sample processing is through
hybridization on a nucleotide microarray. In a more preferred embodiment, the microarray
is an oligonucleotide array. In a most preferred embodiment, the oligonucleotide array is an
ink jet-synthesized oligonucleotide microarray. Preferably, the microarray contains in the
range of 20 to 50,000 nucleic acid probes. The probes can be arranged in a variety of
patterns. For example, the probes can be arranged in rows and columns, polygonal (e.g.,
hexagonal), or circular patterns, etc.

Hybridization levels are preferably measured using polynucleotide probe arrays or
microarrays. On a polynucleotide array, polynucleotide probes comprising sequences of
interest are immobilized to the surface of a support, e.g., a solid support. For example, the
probes may comprise DNA sequences, RNA sequences, or copolymer sequences of DNA
and RNA. The polynucleotide sequences of the probes may also comprise DNA and/or
RNA analogues (e.g., peptide nucleic acids), or combinations thereof. For example, the
polynucleotide sequences of the probe may be full or partial sequences of genomic DNA or
mRNA derived from cells, or may be cDNA or cRNA sequences derived therefrom.

The probe or probes used in the methods of the invention are preferably
immobilized to a solid support or surface which may be either porous or non-porous. For
example, the probes of the invention may be polynucleotide sequences which are attached
to a nitrocellulose or nylon membrane or filter. Such hybridization probes are well known
in the art (see, e.g., Sambrook et al., Eds., 1989, Molecular Cloning: A Laboratory Manual,
Vols. 1-3, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, New York).
Alternatively, the solid support or surface may be a glass or plastic surface.

5.5.1.1 HYBRIDIZATION ASSAY USING MICROARRAYS 

A microarray is an array of positionally-addressable binding (e.g., hybridization)
sites on a support. Each of such binding sites comprises a plurality of polynucleotide
molecules of a probe bound to the predetermined region on the support. Microarrays can be
made in a number of ways, of which several are described herein below (see e.g., Meltzer,
2001, Curr. Opin. Genet. Dev. 11(3):258-63; Andrews et al., 2000, Genome Res.
10(12):2030-43; Abdellatif, 2000, Circ. Res. 86(9):919-20; Lennon, 2000, Drug Discov.
Today 5(2):59-66; Zweiger, 1999, Trends Biotechnol. 17(11):429-36). However produced,
microarrays share certain characteristics. The arrays are preferably reproducible, allowing
multiple copies of a given array to be produced and easily compared with each other.
Preferably, the microarrays are made from materials that are stable under binding (e.g.,
nucleic acid hybridization) conditions. The microarrays are preferably between 1 cm² and
25 cm², preferably about 10 cm² to 15 cm². However, both larger and smaller (e.g., 0.5 cm²
or less) arrays are also contemplated and may be preferable, e.g., for simultaneously
evaluating a very large number of different probes.

In a particularly preferred embodiment, hybridization levels are measured to
microarrays of probes consisting of a solid phase on the surface of which is immobilized a
population of polynucleotides, such as a population of DNA or DNA mimics or,
alternatively, a population of RNA or RNA mimics. The solid phase may be a nonporous
or, optionally, a porous material such as a gel. Microarrays can be employed, e.g., for
analyzing the transcriptional state of a cell such as the transcriptional states of cells exposed
to graded levels of a drug of interest or to graded perturbations to a biological pathway of
interest. Microarrays can be used to simultaneously screen a plurality of different probes to
evaluate, e.g., each probe's sensitivity and specificity for a particular target polynucleotide.
Preferably, a given binding site or unique set of binding sites on the microarray will
specifically bind (e.g., hybridize) to the product of a single gene or gene transcript from a
cell or organism (e.g., to a specific mRNA or to a specific cDNA derived therefrom).
However, in general, other related or similar sequences may cross-hybridize to a given
binding site.

The microarrays used in the methods and compositions of the present invention
include one or more test probes, each of which has a polynucleotide sequence that is
complementary to a subsequence of RNA or DNA to be detected. Each probe preferably
has a different nucleic acid sequence, and the position of each probe on the solid surface of
the array is preferably known. Indeed, the microarrays are preferably addressable arrays,
more preferably positionally addressable arrays. More specifically, each probe of the array
is preferably located at a known, predetermined position on the solid support such that the
identity (i.e., the sequence) of each probe can be determined from its position on the array
(i.e., on the support or surface).

Preferably, the density of probes on a microarray is about 100 different (i.e.,
non-identical) probes per 1 cm² or higher. More preferably, a microarray used in the methods of
the invention will have at least 550 probes per 1 cm², at least 1000 probes per 1 cm², at least
1500 probes per 1 cm² or at least 2000 probes per 1 cm². In a particularly preferred
embodiment, the microarray is a high density array, preferably having a density of at least
about 2500 different probes per 1 cm². The microarrays used in the invention therefore
preferably contain at least 2500, at least 5000, at least 10000, at least 15000, at least 20000,
at least 25000, at least 50000 or at least 55000 different (i.e., non-identical) probes. A
subset of these probes will correspond to spike-in tags which may have been added to the
sample.

Such polynucleotides are preferably of the length of 15 to 200 bases, more
preferably of the length of 20 to 100 bases, most preferably 40-60 bases. It will be
understood that each probe sequence may also comprise a linker (e.g., spacer) in addition to
the sequence that is complementary to its target sequence. As used herein, a linker refers to
a chemical structure between the sequence that is complementary to its target sequence and
the surface. The linker need not be a nucleotide sequence. For example, the linker can be
composed of a nucleotide sequence, or peptide nucleic acids, hydrocarbon chains, etc.

In one embodiment, the microarray is an array (i.e., a matrix) in which each position
represents a discrete binding site for a transcript encoded by a gene (e.g., for an mRNA or a
cDNA derived therefrom). For example, in various embodiments, the microarrays of the
invention can comprise binding sites for products encoded by fewer than 50% of the genes
in the genome of an organism. Alternatively, the microarrays of the invention can have
binding sites for the products encoded by at least 50%, at least 75%, at least 85%, at least
90%, at least 95%, at least 99% or 100%, or at least 50, 100, 500, 1000, or 10000 of the
genes in the genome of an organism. In other embodiments, the microarrays of the
invention can have binding sites for products encoded by fewer than 50%, by at least
50%, by at least 75%, by at least 85%, by at least 90%, by at least 95%, by at least 99% or
by 100% of the genes expressed by a cell of an organism. The binding site can be a DNA
or DNA analog to which a particular RNA can specifically hybridize. The DNA or DNA
analog can be, e.g., a synthetic oligomer or a gene fragment, e.g., corresponding to an exon.

Preferably, the microarrays used in the invention have binding sites (i.e., probes) for
sets of genes for one or more genes relevant to the action of a drug of interest or in a
biological pathway of interest. As discussed above, a "gene" is identified as a portion of
DNA that is transcribed by RNA polymerase, which may include a 5' untranslated region
(UTR), introns, exons and a 3' UTR. The number of genes in a genome can be estimated
from the number of mRNA molecules expressed by the cell or organism, or by
extrapolation of a well characterized portion of the genome. When the genome of the
organism of interest has been sequenced, the number of open reading frames (ORFs) can be
determined and mRNA coding regions identified by analysis of the DNA sequence. For
example, the genome of Saccharomyces cerevisiae has been completely sequenced and is
reported to have approximately 6275 ORFs encoding sequences longer than 99 amino acid
residues in length. Analysis of these ORFs indicates that there are 5,885 ORFs that are
likely to encode protein products (Goffeau et al., 1996, Science 274:546-567). In contrast,
the human genome is estimated to contain approximately 30000 to 130000 genes (see
Crollius et al., 2000, Nature Genetics 25:235-238; Ewing et al., 2000, Nature Genetics
25:232-234). Genome sequences for other organisms, including but not limited to
Drosophila, C. elegans, plants, e.g., rice and Arabidopsis, and mammals, e.g., mouse and
human, are also completed or nearly completed. Thus, in preferred embodiments of the
invention, an array set comprising probes for all genes in the genome of an organism is
provided.

It will be appreciated that when a sample of target nucleic acid molecules, e.g.,
cDNA complementary to the RNA of a cell, is made and hybridized to a microarray under
suitable hybridization conditions, the level of hybridization to the site in the array will
reflect the prevalence of the corresponding complementary sequences in the sample. For
example, when detectably labeled (e.g., with a fluorophore) cDNA is hybridized to a
microarray, the site on the array corresponding to a nucleotide sequence that is not in the
sample will have little or no signal (e.g., fluorescent signal), and a nucleotide sequence that
is prevalent in the sample will have a relatively strong signal. The relative abundance of
different nucleotide sequences in a sample may be determined by the signal strength pattern
of probes on a microarray.

Nucleic acids from samples from two different cells subjected to two different
conditions can be hybridized to the binding sites of the microarray using a two-color
protocol. In the case of drug responses, one cell sample is exposed to a drug and another
cell sample of the same type is not exposed to the drug. The cDNA derived from each of
the two cell types is differently labeled (e.g., with Cy3 and Cy5) so that they can be
distinguished. In one embodiment, for example, cDNA from a cell treated with a drug (or
having a mutation or a disease, etc.) is synthesized using a fluorescein-labeled dNTP, and
cDNA from a second cell, not drug-exposed, is synthesized using a rhodamine-labeled
dNTP. When the two cDNA molecules are mixed and hybridized to the microarray, the
relative intensity of signal from each cDNA set is determined for each site on the array, and
any relative difference in abundance of a particular gene detected.
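
The per-site comparison of the two channels can be sketched as follows. The log2 transform, the two-fold threshold, and the convention of dividing the treated-sample signal by the untreated-sample signal are assumptions made for illustration; the text only requires that the relative intensity be determined for each site.

```python
import math

def log2_ratio(treated, untreated):
    """Log2 of the treated-channel signal over the untreated-channel signal."""
    return math.log2(treated / untreated)

def classify(treated, untreated, threshold=1.0):
    """Label a binding site by the direction of change in abundance.
    A threshold of 1.0 corresponds to a two-fold change (assumed cutoff)."""
    ratio = log2_ratio(treated, untreated)
    if ratio >= threshold:
        return "increased"
    if ratio <= -threshold:
        return "decreased"
    return "unchanged"

# Hypothetical (treated, untreated) channel intensities for three binding sites.
sites = {"site_a": (2000.0, 500.0),
         "site_b": (500.0, 2000.0),
         "site_c": (1000.0, 1100.0)}
calls = {name: classify(t, u) for name, (t, u) in sites.items()}
```

Working with log ratios keeps increases and decreases symmetric about zero, which is a common convenience in two-color analysis rather than a requirement of the method.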

In the example described above, the nucleic acid from the drug-treated cell will
fluoresce green when the fluorophore is stimulated and the nucleic acid from the untreated
cell will fluoresce red. As a result, when the drug treatment has no effect, either directly or
indirectly, on the transcription of a particular gene in a cell, the expression patterns will be
indistinguishable in both cells and, upon reverse transcription, red-labeled and green-labeled
nucleic acids will be equally prevalent. When hybridized to the microarray, the
binding site(s) for that species of nucleic acid will emit wavelengths characteristic of both
fluorophores. In contrast, when the drug-exposed cell is treated with a drug that, directly or
indirectly, changes the transcription of a particular gene in the cell, the expression pattern as
represented by the ratio of green to red fluorescence for each binding site will change. When
the drug increases the prevalence of an mRNA, the ratios for each binding site of the
mRNA will increase, whereas when the drug decreases the prevalence of an mRNA, the
ratio for each binding site in the mRNA will decrease.

The use of a two-color fluorescence labeling and detection scheme to define
alterations in gene expression has been described in connection with detection of mRNA
molecules, e.g., in Schena et al., 1995, Quantitative monitoring of gene expression patterns
with a complementary DNA microarray, Science 270:467-470. An advantage of using
cDNA labeled with two different fluorophores is that a direct and internally controlled
comparison of the mRNA or exon expression levels corresponding to each arrayed gene in
two cell states can be made, and variations due to minor differences in experimental
conditions (e.g., hybridization conditions) will not affect subsequent analyses. However, it
will be recognized that it is also possible to use cDNA from a single cell, and compare, for
example, the absolute amount of a particular exon in, e.g., a drug-treated or pathway-perturbed
cell and an untreated cell. Furthermore, labeling with more than two colors is
also contemplated in the present invention. In some embodiments of the invention, at least
5, 10, 20, or 100 dyes of different colors can be used for labeling. Such labeling permits
simultaneous hybridizing of the distinguishably labeled cDNA populations to the same
array, and thus measuring, and optionally comparing the expression levels of, mRNA
molecules derived from more than two samples. Dyes that can be used include, but are not
limited to, fluorescein and its derivatives, rhodamine and its derivatives, texas red,
5'carboxy-fluorescein (FMA), 2',7'-dimethoxy-4',5'-dichloro-6-carboxy-fluorescein (JOE),
N,N,N',N'-tetramethyl-6-carboxy-rhodamine (TAMRA), 6'carboxy-X-rhodamine (ROX),
HEX, TET, IRD40, and IRD41; cyanine dyes, including but not limited to Cy3, Cy3.5
and Cy5; BODIPY dyes, including but not limited to BODIPY-FL, BODIPY-TR,
BODIPY-TMR, BODIPY-630/650, and BODIPY-650/670; and ALEXA dyes, including
but not limited to ALEXA-488, ALEXA-532, ALEXA-546, ALEXA-568, and ALEXA-594;
as well as other fluorescent dyes which will be known to those who are skilled in the
art.

5.5.1.2 PREPARING PROBES FOR MICROARRAYS

As noted above, the probe to which a particular polynucleotide molecule specifically 
hybridizes is a complementary polynucleotide sequence. Typically each probe on the 
microarray will be between 20 bases and 600 bases, and usually between 30 and 200 bases 
in length.

The means for generating the polynucleotide probes of the microarray is by
synthesis of synthetic polynucleotides or oligonucleotides, e.g., using N-phosphonate or
phosphoramidite chemistries (Froehler et al., 1986, Nucleic Acid Res. 14:5399-5407;
McBride et al., 1983, Tetrahedron Lett. 24:246-248). Synthetic sequences are typically
between about 15 and about 600 bases in length, more typically between about 20 and about
100 bases, most preferably between about 40 and about 70 bases in length.

The probes on the microarrays are macromolecules attached to the solid support of a
microarray. In the present invention, the probes are preferably nucleic acid sequences (or
fragments thereof).

5.5.1.3 ATTACHING PROBES TO THE SOLID SURFACE 
Methods of the invention utilize polynucleotide probes synthesized directly on the
support to form the array. The probes are attached to a solid support or surface, which may
be made, e.g., from glass, plastic (e.g., polypropylene, nylon), polyacrylamide,
nitrocellulose, gel, or other porous or nonporous material.

A method for making microarrays is by making high-density oligonucleotide arrays.
There are a variety of techniques known for producing arrays containing thousands of
oligonucleotides complementary to defined sequences, at defined locations on a surface.
For example, photolithographic techniques for synthesis in situ (see, Fodor et al., 1991,
Science 251:767-773; Pease et al., 1994, Proc. Natl. Acad. Sci. U.S.A. 91:5022-5026;
Lockhart et al., 1996, Nature BioTechnology 14:1675; U.S. Patent Nos. 5,489,678;
5,578,832; 5,556,752; 5,510,270; 6,197,506; and 6,346,413) or other methods for rapid
synthesis and deposition of defined oligonucleotides (Blanchard et al., Biosensors &
Bioelectronics 11:687-690) may be used.

Other methods for making microarrays, e.g., by masking (Maskos and Southern,
1992, Nucl. Acids Res. 20:1679-1684), may also be used. In principle, and as noted supra,
any type of array, for example, dot blots on a nylon hybridization membrane (see Sambrook
et al., supra) could be used. However, as will be recognized by those skilled in the art, very
small arrays will frequently be preferred because hybridization volumes will be smaller.

In a particularly preferred embodiment, microarrays of the invention are
manufactured by means of an ink jet printing device for oligonucleotide synthesis, e.g.,
using the methods and systems described by Blanchard in International Patent Publication
No. WO 98/41531, published September 24, 1998; Blanchard et al., 1996, Biosensors and
Bioelectronics 11:687-690; Blanchard, 1998, in Synthetic DNA Arrays in Genetic
Engineering, Vol. 20, J.K. Setlow, Ed., Plenum Press, New York at pages 111-123;
Hughes et al., 2001, Nature BioTechnology 19:342-347; and U.S. Patent No. 6,028,189 to
Blanchard. Specifically, the oligonucleotide probes in such microarrays are preferably
synthesized in arrays, e.g., on a glass slide, by serially depositing individual nucleotide
bases in microdroplets of a high surface tension solvent such as propylene carbonate. The
microdroplets have small volumes (e.g., 100 pL or less, more preferably 50 pL or less) and
are separated from each other on the microarray (e.g., by hydrophobic domains) to form
circular surface tension wells which define the locations of the array elements (i.e., the
different probes). Polynucleotide probes are attached to the surface covalently at the 3' end
of the polynucleotide.

When these methods are used, oligonucleotides (e.g., 60-mers) of known sequence
are synthesized directly on a surface such as a derivatized glass slide. The array produced
can be redundant, with several oligonucleotide molecules per gene.

5.5.1.4 TARGET POLYNUCLEOTIDE MOLECULES
Target polynucleotides are the polynucleotides of the biological samples that are
being processed on the microarray. Target polynucleotides can be RNA molecules such as,
but by no means limited to, messenger RNA (mRNA) molecules, ribosomal RNA (rRNA)
molecules, cRNA molecules (i.e., RNA molecules prepared from cDNA molecules that are
transcribed in vitro) and fragments thereof. Additionally, target polynucleotides may also
be, but are not limited to, DNA molecules such as genomic DNA molecules, cDNA
molecules, and fragments thereof including oligonucleotides, ESTs, STSs, etc. In specific
embodiments, the sample comprises more than 1000, 5000, 10000, 50000, 100000, 250000,
or 1000000 nucleic acid molecules of different nucleotide sequences.

The target polynucleotides may be from any source. For example, the target
polynucleotide molecules may be naturally occurring nucleic acid molecules such as
genomic or extragenomic DNA molecules isolated from an organism, or RNA molecules,
such as mRNA molecules, isolated from an organism. Alternatively, the polynucleotide
molecules may be synthesized, including, e.g., nucleic acid molecules synthesized
enzymatically in vivo or in vitro, such as cDNA molecules, or polynucleotide molecules
synthesized by PCR, RNA molecules synthesized by in vitro transcription, etc. The sample
of target polynucleotides can comprise, e.g., molecules of DNA, RNA, or copolymers of
DNA and RNA. In preferred embodiments, the target polynucleotides of the invention will
correspond to particular genes or to particular gene transcripts (e.g., to particular mRNA
sequences expressed in cells or to particular cDNA sequences derived from such mRNA
sequences). However, in many embodiments, particularly those embodiments wherein the
polynucleotide molecules are derived from mammalian cells, the target polynucleotides
may correspond to particular fragments of a gene transcript. For example, the target
polynucleotides may correspond to different exons of the same gene, e.g., so that different
splice variants of that gene may be detected and/or analyzed.

In preferred embodiments, the target polynucleotides to be analyzed are prepared in
vitro from nucleic acids extracted from cells. For example, in one embodiment, RNA is
extracted from cells (e.g., total cellular RNA, poly(A)+ messenger RNA, or a fraction thereof)
and messenger RNA is purified from the total extracted RNA. Methods for preparing total
and poly(A)+ RNA are well known in the art, and are described generally, e.g., in
Sambrook et al., supra. In one embodiment, RNA is extracted from cells of the various
types of interest in this invention using guanidinium thiocyanate lysis followed by CsCl
centrifugation and an oligo dT purification (Chirgwin et al., 1979, Biochemistry
18:5294-5299). In another embodiment, total RNA is extracted from cells using guanidinium
thiocyanate lysis followed by purification on RNeasy columns (Qiagen). cDNA is then
synthesized from the purified mRNA using, e.g., oligo-dT or random primers. In preferred
embodiments, the target polynucleotides are cRNA prepared from cDNA prepared from
purified mRNA or from total RNA extracted from cells. As used herein, cRNA can either
be complementary to (anti-sense) or of the same sequence (sense) as the sample RNA. The
extracted RNA molecules are amplified using a process in which double-stranded cDNA
molecules are synthesized from the sample RNA molecules using primers linked to an
RNA polymerase promoter. As a result, RNA polymerase promoters can be incorporated
into either or both strands of the cDNA. Using the RNA polymerase promoter that is on the
first strand of the cDNA molecule, cRNA can be synthesized that is the same sequence as
the sample RNA. To synthesize cRNA complementary to the sample RNA, transcription
can be initiated from the RNA polymerase promoter that is on the second strand of the
double-stranded cDNA molecule using an RNA polymerase (see, e.g., U.S. Patent Nos.
5,891,636; 5,716,785; 5,545,522 and 6,132,997; see also, U.S. Patent No. 6,271,002 and
U.S. Provisional Patent Application Serial No. 60/253,641, filed on November 28, 2000, by
Ziman et al.). Either oligo-dT primers (U.S. Patent Nos. 5,545,522 and 6,132,997) or
random primers (U.S. Provisional Patent Application Serial No. 60/253,641, filed on
November 28, 2000, by Ziman et al.) that contain an RNA polymerase promoter or
complement thereof can be used. Preferably, the target polynucleotides are short and/or
fragmented polynucleotide molecules which are representative of the original nucleic acid
population of the cell. In one embodiment, total RNA is used as input for cRNA synthesis.
An oligo-dT primer containing a T7 RNA polymerase promoter sequence can be used to
prime first strand cDNA synthesis. When second strand synthesis is desired, random
hexamers can be used to prime second strand cDNA synthesis by a reverse transcriptase.
This reaction yields a double-stranded cDNA that contains the T7 RNA polymerase
promoter at the 3' end. The double-stranded cDNA can then be transcribed into cRNA by
T7 RNA polymerase.

The target polynucleotides to be analyzed are preferably detectably labeled. For
example, cDNA can be labeled directly, e.g., with nucleotide analogs, or indirectly, e.g., by
making a second, labeled cDNA strand using the first strand as a template. Alternatively,
the double-stranded cDNA can be transcribed into cRNA and labeled.

Preferably, the detectable label is a fluorescent label, e.g., by incorporation of
nucleotide analogs. Other labels suitable for use in the present invention include, but are
not limited to, biotin, iminobiotin, antigens, cofactors, dinitrophenol, lipoic acid, olefinic
compounds, detectable polypeptides, electron rich molecules, enzymes capable of
generating a detectable signal by action upon a substrate, and radioactive isotopes.
Preferred radioactive isotopes include ³²P, ³⁵S, ¹⁴C, and ¹²⁵I. Fluorescent molecules
suitable for the present invention include, but are not limited to, fluorescein and its
derivatives, rhodamine and its derivatives, texas red, 5'carboxy-fluorescein (FMA),
2',7'-dimethoxy-4',5'-dichloro-6-carboxy-fluorescein (JOE),
N,N,N',N'-tetramethyl-6-carboxy-rhodamine (TAMRA), 6'carboxy-X-rhodamine (ROX),
HEX, TET, IRD40, and IRD41.
Fluorescent molecules that are suitable for the invention further include: cyanine dyes,
including but not limited to Cy3, Cy3.5 and Cy5; BODIPY dyes, including but not limited to
BODIPY-FL, BODIPY-TR, BODIPY-TMR, BODIPY-630/650, and BODIPY-650/670;
and ALEXA dyes, including but not limited to ALEXA-488, ALEXA-532, ALEXA-546,
ALEXA-568, and ALEXA-594; as well as other fluorescent dyes which will be known to
those who are skilled in the art. Electron rich indicator molecules suitable for the present
invention include, but are not limited to, ferritin, hemocyanin, and colloidal gold.
Alternatively, in less preferred embodiments the target polynucleotides may be labeled by
specifically complexing a first group to the polynucleotide. A second group, covalently
linked to an indicator molecule and which has an affinity for the first group, can be used to
indirectly detect the target polynucleotide. In such an embodiment, compounds suitable for
use as a first group include, but are not limited to, biotin and iminobiotin. Compounds
suitable for use as a second group include, but are not limited to, avidin and streptavidin.

The binding partners of the predetermined binding sequence of the quality control
probes can be added to the target molecules prior to contact with the microarray. In one
embodiment, the binding partners are added to the target molecules during target molecule
processing. In a more preferred embodiment, the binding partners are added to the target
molecules immediately prior to contacting the microarray.

5.5.1.5 HYBRIDIZATION TO MICROARRAYS
As described supra, nucleic acid hybridization and wash conditions are chosen so
that the polynucleotide molecules to be analyzed (or target polynucleotide molecules)
specifically bind or specifically hybridize to the complementary polynucleotide sequences
of the array, preferably to one or more specific array sites, wherein its complementary
sequence is located.

Arrays containing double-stranded probe DNA situated thereon are preferably
subjected to denaturing conditions to render the DNA single-stranded prior to contacting
with the target polynucleotide molecules. Arrays containing single-stranded probe DNA
(e.g., synthetic oligodeoxyribonucleic acids) may need to be denatured prior to contacting
with the target polynucleotide molecules, e.g., to remove hairpins or dimers which form due
to self complementary sequences.

Optimal hybridization conditions will depend on the length (e.g., oligomer versus
polynucleotide greater than 200 bases) and type (e.g., RNA, or DNA) of probe and target
nucleic acids. General parameters for specific (i.e., stringent) hybridization conditions for
nucleic acids are described in Sambrook et al., (supra), and in Ausubel et al., 1987, Current
Protocols in Molecular Biology, Greene Publishing and Wiley-Interscience, New York. For
example, when cDNA microarrays are used, typical hybridization conditions are
hybridization in 5 X SSC plus 0.2% SDS at 65 °C for four hours, followed by washes at 25
°C in low stringency wash buffer (1 X SSC plus 0.2% SDS), followed by 10 minutes at 25
°C in higher stringency wash buffer (0.1 X SSC plus 0.2% SDS) (Hughes et al., 2001,
Nature BioTechnology 19:342-347). Useful hybridization conditions are also provided in,
e.g., Tijessen, 1993, Hybridization With Nucleic Acid Probes, Elsevier Science Publishers
B.V. and Kricka, 1992, Nonisotopic DNA Probe Techniques, Academic Press, San Diego,
CA.

Particularly preferred hybridization conditions for use with the screening and/or
signaling chips of the present invention include hybridization at a temperature at or near the
mean melting temperature of the probes (e.g., within 5°C, more preferably within 2°C) in
1M NaCl, 50mM MES buffer (pH 6.5), 0.5% sodium Sarcosine and 30% formamide.

5.5.1.6 SIGNAL DETECTION AND DATA ANALYSIS
It will be appreciated that when target sequences, e.g., cDNA or cRNA,
complementary to the RNA of a cell are made and hybridized to a microarray under suitable
hybridization conditions, the level of hybridization to the site in the array corresponding to a
particular gene will reflect the prevalence in the cell of mRNA molecules
containing the transcript from that gene. For example, when detectably labeled (e.g., with a
fluorophore) cDNA complementary to the total cellular mRNA is hybridized to a
microarray, the site on the array corresponding to a gene (i.e., capable of specifically
binding the product or products of the gene expressing) that is not transcribed in the cell
will have little or no signal (e.g., fluorescent signal), and a gene for which the encoded
mRNA expressing the transcript is prevalent will have a relatively strong signal.

In preferred embodiments, target sequences, e.g., cDNA molecules or cRNA
molecules, from two different cells are hybridized to the binding sites of the microarray. In
the case of drug responses one cell sample is exposed to a drug and another cell sample of
the same type is not exposed to the drug. In the case of pathway responses one cell is
exposed to a pathway perturbation and another cell of the same type is not exposed to the
pathway perturbation. The cDNA or cRNA derived from each of the two cell types are
differently labeled so that they can be distinguished. In one embodiment, for example,
cDNA from a cell treated with a drug (or otherwise perturbed) is synthesized using a
fluorescein-labeled dNTP, and cDNA from a second cell, not drug-exposed, is synthesized
using a rhodamine-labeled dNTP. When the two cDNA molecules are mixed and
hybridized to the microarray, the relative intensity of signal from each cDNA set is
determined for each site on the array, and any relative difference in abundance of a
particular transcript detected.

In the example described in the previous paragraph, the cDNA from the
drug-treated (or otherwise perturbed) cell will fluoresce green when the fluorophore is stimulated
and the cDNA from the untreated cell will fluoresce red. As a result, when the drug
treatment has no effect, either directly or indirectly, on the transcription of a particular gene
in a cell, the expression pattern will be indistinguishable in both cells and, upon reverse
transcription, red-labeled and green-labeled cDNA will be equally prevalent. When
hybridized to the microarray, the binding site(s) for that species of RNA will emit
wavelengths characteristic of both fluorophores. In contrast, when the drug-exposed cell is
treated with a drug that, directly or indirectly, changes the transcription or splicing of a
particular gene in the cell, the expression pattern as represented by the ratio of green to red
fluorescence for each transcript binding site will change. When the drug increases the
prevalence of an mRNA, the ratios for each transcript fragment expressed in the mRNA will
increase, whereas when the drug decreases the prevalence of an mRNA, the ratio for each
exon expressed in the mRNA will decrease.

The use of a two-color fluorescence labeling and detection scheme to define
alterations in gene expression has been described in connection with detection of mRNA
molecules, e.g., in Schena et al., 1995, Quantitative monitoring of gene expression patterns
with a complementary DNA microarray, Science 270:467-470. An advantage of using
target sequences, e.g., cDNA molecules or cRNA molecules, labeled with two different
fluorophores is that a direct and internally controlled comparison of the mRNA expression
levels corresponding to each arrayed gene in two cell states can be made, and variations due
to minor differences in experimental conditions (e.g., hybridization conditions) will not
affect subsequent analyses. However, it will be recognized that it is also possible to use
cDNA from a single cell, and compare, for example, the absolute amount of a particular
exon in, e.g., a drug-treated or otherwise perturbed cell and an untreated cell.

In other preferred embodiments, single channel detection methods, e.g., using
one-color fluorescence labeling, are used (see U.S. Patent Application Serial No. 09/781,814,
filed on February 12, 2001). In this embodiment, arrays comprising reverse-complement
(RC) probes are designed and produced. Because a reverse complement of a DNA
sequence has sequence complexity that is equivalent to the corresponding forward-strand
(FS) probe that is complementary to a target sequence with respect to a variety of measures
(e.g., measures such as GC content and GC trend are invariant under the reverse
complement), an RC probe is used as a control probe for determination of the level of
non-specific cross hybridization to the corresponding FS probe. The significance of the FS
probe intensity of a target sequence is determined by comparing the raw intensity
measurement for the FS probe and the corresponding raw intensity measurement for the RC
probe in conjunction with the respective measurement errors. In a preferred embodiment, a
transcript is called present if the intensity difference between the FS probe and the
corresponding RC probe is significant. More preferably, a transcript is called present if the
FS probe intensity is also significantly above background level. Single channel detection
methods can be used in conjunction with multi-color labeling. In one embodiment, a
plurality of different samples, each labeled with a different color, is hybridized to an array.
Differences between FS and RC probes for each color are used to determine the level of
hybridization of the corresponding sample.
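
The presence call described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the method of the cited application: it assumes the FS and RC measurement errors are independent and add in quadrature, and the names z_score and transcript_present, as well as the cut-off Z_CUT, are hypothetical choices made for this example.

```python
import math

Z_CUT = 3.0  # assumed significance cut-off; not a value given in the text


def z_score(a, a_err, b, b_err):
    """Difference a - b expressed in units of the combined measurement error."""
    combined = math.hypot(a_err, b_err)  # errors assumed independent
    if combined == 0:
        return math.inf if a > b else 0.0
    return (a - b) / combined


def transcript_present(fs, fs_err, rc, rc_err, bg=None, bg_err=0.0):
    """Call a transcript present when FS significantly exceeds RC.

    If a background level is supplied, FS must also significantly
    exceed background (the "more preferably" condition).
    """
    if z_score(fs, fs_err, rc, rc_err) < Z_CUT:
        return False
    if bg is not None and z_score(fs, fs_err, bg, bg_err) < Z_CUT:
        return False
    return True
```

With multi-color labeling, the same function would simply be applied once per color channel.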

When fluorescently labeled probes are used, the fluorescence emissions at each site
of a transcript array can be, preferably, detected by scanning confocal laser microscopy. In
one embodiment, a separate scan, using the appropriate excitation line, is carried out for
each of the two fluorophores used. Alternatively, a laser can be used that allows
simultaneous specimen illumination at wavelengths specific to the two fluorophores and
emissions from the two fluorophores can be analyzed simultaneously (see Shalon et al.,
1996, Genome Res. 6:639-645). In a preferred embodiment, the arrays are scanned with a
laser fluorescence scanner with a computer controlled X-Y stage and a microscope
objective. Sequential excitation of the two fluorophores is achieved with a multi-line,
mixed gas laser, and the emitted light is split by wavelength and detected with two
photomultiplier tubes. Such fluorescence laser scanning devices are described, e.g., in
Schena et al., 1996, Genome Res. 6:639-645. Alternatively, the fiber-optic bundle
described by Ferguson et al., 1996, Nature BioTechnology 14:1681-1684, may be used to
monitor mRNA abundance levels at a large number of sites simultaneously.

Signals are recorded and, in a preferred embodiment, analyzed by computer, e.g.,
using a 12 bit or 16 bit analog to digital board. In one embodiment, the scanned image is
despeckled using a graphics program (e.g., Hijaak Graphics Suite) and then analyzed using
an image gridding program that creates a spreadsheet of the average hybridization at each
wavelength at each site. If necessary, an experimentally determined correction for cross
talk (or overlap) between the channels for the two fluors may be made. For any particular
hybridization site on the transcript array, a ratio of the emission of the two fluorophores can
be calculated. The ratio is independent of the absolute expression level of the cognate gene,
but is useful for genes whose expression is significantly modulated by drug administration,
gene deletion, or any other tested event.

The relative abundance of an mRNA in two cells or cell lines is preferably scored as
perturbed (i.e., the abundance is different in the two sources of mRNA tested) or as not
perturbed (i.e., the relative abundance is the same). As used herein, a difference between
the two sources of RNA of at least a factor of about 25% (i.e., RNA is 25% more abundant
in one source than in the other source), more usually about 50%, even more often by a
factor of about 2 (i.e., twice as abundant), 3 (three times as abundant), or 5 (five times as
abundant) is preferably scored as a perturbation.
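
As an illustration of this scoring rule, the following minimal sketch computes a symmetric fold change from the two channel intensities of one site and compares it with a chosen threshold. The function name score_perturbation and the default threshold are assumptions made for this example; the thresholds named in the text correspond to min_fold values of 1.25, 1.5, 2, 3, or 5.

```python
def score_perturbation(green, red, min_fold=2.0):
    """Return (fold_change, perturbed) for one hybridization site.

    The fold change is symmetric: a transcript that is twice as abundant
    in either source gives 2.0, regardless of which channel is brighter.
    """
    if green <= 0 or red <= 0:
        raise ValueError("channel intensities must be positive")
    fold = max(green, red) / min(green, red)
    return fold, fold >= min_fold
```

The fold change returned alongside the boolean call also serves when the magnitude of the difference, rather than only a perturbed or not perturbed score, is of interest.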

It is, however, also advantageous to determine the magnitude of the relative
difference in abundances for an mRNA expressed in two cells or in two cell
lines. This can be carried out, as noted above, by calculating the ratio of the emission of the
two fluorophores used for differential labeling, or by analogous methods that will be readily
apparent to those of skill in the art.

5.5.2 PROTEIN MICROARRAYS

In an embodiment of the present invention, the microarray is a protein microarray.
As a result, the quality control probe in this embodiment is a polypeptide or peptide.
Protein quality control probes preferably have a corresponding binding partner available
such that contacting the probe with said binding partner can allow for specific and
quantifiable binding.

On a protein microarray, protein probes possessing the ability to bind proteins of
interest are immobilized to the surface of a substrate, e.g., a solid support (see, e.g., Goffeau
et al., 1996, Science 274:546-567; Aebersold et al., 1999, Nature BioTechnology 10:994-
999; Haab et al., 2001, Genome Biology 2:RESEARCH0004.1-RESEARCH0004.13; U.S.
Patent No. 6,346,413). For example, polypeptide probes may be prepared using standard
solid-phase techniques for the synthesis of peptides. As is generally known, polypeptides
can be prepared using commercially available equipment and reagents following the
manufacturers' instructions for blocking interfering groups, protecting the amino acid to be
reacted, coupling, deprotection, and capping of unreacted residues. The protein probes may
contain non-peptide linkages and/or modified or non-naturally occurring amino acids, e.g.,
D-amino acids, phosphorous analogs of amino acids, such as α-amino phosphoric acids and
β-amino phosphoric acids.

The probes used in the methods of the invention are preferably synthesized on a
solid support or surface which may be either porous or non-porous. For example, the
probes of the invention may be polypeptide sequences which are attached to a nitrocellulose
or nylon membrane or filter. Alternatively, the solid support or surface may be a glass or
plastic surface.

Proteins can be synthesized on a positionally addressable array with a plurality of
proteins attached to a substrate, with each protein being at a different position on the solid
support. Preferably, the plurality of proteins comprises at least 10, 50, 100, 250, 500, 1000,
1500, 2000, 2500, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 50000, or 100000
different polypeptides expressed in a single biological sample, plus the quality control
probes. Protein probes are synthesized onto the substrate in a step-by-step synthesis using
amino acid monomers.

In one embodiment, the quality control probe is an antibody or fragment thereof. In
another embodiment, the binding partner of the quality control probe is an antibody or
fragment thereof. In a preferred embodiment, the antibody is a monoclonal antibody or
fragment (e.g., Fab fragment) thereof (see, e.g., Zhu et al., 2001, Science 293:2101-2105;
MacBeath et al., 2000, Science 289:1760-63; de Wildt et al., 2000, Nature BioTechnology
18:989-994).

It will be appreciated that when a sample of protein is bound to a protein microarray
under suitable conditions, the level of binding to a particular site in the array will reflect the
prevalence of the corresponding binding partner in the sample. The level of binding
between a polypeptide quality control probe on the microarray and its protein binding partner
is preferably indicated by signaling compounds. For example, when a protein sample is
bound to a protein microarray, the site on the array corresponding to a polypeptide probe
with a corresponding binding partner not in the sample will have little or no signal, and a
polypeptide probe with a corresponding binding partner that is prevalent in the sample will
have a relatively strong signal. The relative abundance of different proteins in a sample
may be determined by the signal strength pattern of probes on a microarray. In one
embodiment, one or more signal compounds (e.g., fluorescent dyes) are directly attached to
the protein binding partner of the quality control probe. In another embodiment, one or
more signal compounds are attached to the protein binding partner of the quality control
probe indirectly (e.g., through the use of fluorescently labeled antibodies).

5.6 IMPLEMENTATION SYSTEMS AND METHODS 
The analytical methods of the present invention can preferably be implemented 
using a computer system, such as the computer system described in this section, according 
to the following programs and methods. Such a computer system can also preferably store 
and manipulate a database of the present invention which comprises a compendium of 
positional information pertaining to the location of quality control probes on the microarray 
as well as in which sequential cycles of synthesis they were synthesized (i.e., the vertical 
placement in the microarray) and which can be used by a computer system in implementing
the analytical methods of this invention. Accordingly, such computer systems are also 
considered part of the present invention. In a specific embodiment, the quality control 
positional information is stored in digital form in a database. 

In a specific embodiment, the computer system comprises one or more processing
units and one or more memory units connected to said one or more processor units. Said
one or more memory units contain one or more programs which cause said one or more
processor units to execute steps of comparing the binding to their binding partner of two or
more of the quality control probes on an array of the invention. The result is output,
preferably as a binding ratio of the quality control probes. In a specific embodiment, the
computer programs cause said one or more processors to execute steps of:

(a) receiving a first data structure comprising the binding intensity of the quality
control probes on the processed microarray;

(b) comparing said first data structure to a plurality of data structures in a
database, each data structure comprising positional information regarding the quality
control probes associated with said microarray, to identify the relevant positions on said
microarray to compare to assess synthesis integrity; and

(c) comparing the binding of two or more quality control probes.
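
Steps (a) through (c) can be sketched as follows. This is a minimal illustration, not the claimed program: the dictionary layouts, the function names probe_means and binding_ratios, and the probe labels are all assumptions introduced for this example. The ratio is taken as the reference probe over each other probe, following the 25mer to 45mer or 60mer convention used in Example 1.

```python
from statistics import mean


def probe_means(intensities, positions):
    """Average the measured intensity over the sites of each probe type.

    intensities: {(row, col): value}           # step (a), measured data
    positions:   {probe_name: [(row, col)]}    # step (b), positional info
    """
    return {
        name: mean(intensities[site] for site in sites)
        for name, sites in positions.items()
    }


def binding_ratios(means, reference):
    """Step (c): ratio of the reference probe to every other probe."""
    ref = means[reference]
    return {
        f"{reference}/{name}": ref / value
        for name, value in means.items()
        if name != reference
    }
```

For instance, binding_ratios(probe_means(intensities, positions), "25mer") would yield one ratio per spacer-bearing probe type on the array.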

In a specific embodiment, the computer system comprises a program that causes the 
processor to compare the appropriate quality control probe binding intensities and thereby 
determine if the microarray was synthesized correctly. 

In another embodiment, the computer system performs one or more aspects of the
sample quality control. For example, the computer can read the microarray's quality
control probe intensities directly from the raw data represented in a TIFF file of the scanned
microarray image and compare the appropriate intensities, and determine if the synthesis of
the array is defective, thus resulting in suspect data. If a synthesis defect is identified, the
computer could generate a non-conformance report and refrain from automatically adding
the suspect data to the database containing microarray processing data until the quality
control issues are further addressed. In one embodiment, the computer would generate a
non-conformance report if the binding ratio of the quality control probes is not between 0.5
and 2.0.
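
The acceptance window of 0.5 to 2.0 can be expressed as a small check. The sketch below is illustrative only: the function name and report wording are hypothetical, and treating the end points 0.5 and 2.0 as passing is an assumption, since the text does not say whether the bounds are inclusive.

```python
def non_conformance_report(ratios, low=0.5, high=2.0):
    """Return one report line per out-of-window ratio; empty list means pass.

    ratios: {label: binding_ratio}. Bounds are treated as inclusive
    (an assumption; the text only says "between 0.5 and 2.0").
    """
    return [
        f"NON-CONFORMANCE: {label} ratio {value:.2f} outside [{low}, {high}]"
        for label, value in sorted(ratios.items())
        if not (low <= value <= high)
    ]
```

A calling program could withhold the array's data from the database whenever the returned list is non-empty.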

An exemplary computer system suitable for implementing the analytic methods of
this invention preferably comprises internal components being linked to external
components. The internal components of this computer system include a processor element
interconnected with a main memory. For example, the computer system can be an Intel
Pentium®-based processor of 200 MHz or greater clock rate and with 32 MB or more main
memory. In a preferred embodiment, the computer system is a cluster of a plurality of
computers comprising a head "node" and eight sibling "nodes", with each node having a
central processing unit (CPU). In addition, the cluster also comprises at least 128MB of
random access memory (RAM) on the head node and at least 256MB of RAM on each of
the eight sibling nodes. Therefore, the computer systems of the present invention are not
limited to those consisting of a single memory unit or a single processor unit.

The external components can include a mass storage. This mass storage can be one
or more hard disks that are typically packaged together with the processor and memory.
Such hard disks are typically of 1 GB or greater storage capacity and more preferably have at
least 6GB of storage capacity. For example, in a preferred embodiment, described above,
wherein a computer system of the invention comprises several nodes, each node can have
its own hard drive. The head node preferably has a hard drive with at least 6GB of storage
capacity whereas each sibling node preferably has a hard drive with at least 9GB of storage
capacity. A computer system of the invention can further comprise other mass storage units
including, for example, one or more floppy drives, one or more CD-ROM drives, one or more
DVD drives or one or more DAT drives.

Other external components typically include a user interface device, which is most
typically a monitor and a keyboard together with a graphical input device such as a
"mouse". The computer system is also typically linked to a network link which can be, e.g.,
part of a local area network (LAN) to other, local computer systems and/or part of a wide
area network (WAN), such as the Internet, that is connected to other, remote computer
systems. For example, in the preferred embodiment, discussed above, wherein the
computer system comprises a plurality of nodes, each node is preferably connected to a
network, preferably an NFS network, so that the nodes of the computer system
communicate with each other and, optionally, with other computer systems by means of the
network and can thereby share data and processing tasks with one another.

Loaded into memory during operation of such a computer system are several
software components. The software components comprise both software components that
are standard in the art and components that are special to the present invention. These
software components are typically stored on mass storage such as the hard drive, but can be
stored on other computer readable media as well including, for example, one or more floppy
disks, one or more CD-ROMs, one or more DVDs or one or more DATs. The software
component represents an operating system which is responsible for managing the computer
system and its network interconnections. The operating system can be, for example, of the
Microsoft Windows™ family such as Windows 95, Windows 98, Windows NT or
Windows 2000. Alternatively, the operating software can be a Macintosh operating system,
a UNIX operating system or the LINUX operating system. The software components
comprise common languages and functions that are preferably present in the system to
assist programs implementing methods specific to the present invention. Languages that
can be used to program the analytic methods of the invention include, for example, C and
C++, FORTRAN, PERL, HTML, JAVA, and any of the UNIX or LINUX shell command
languages such as C shell script language. The methods of the invention can also be
programmed or modeled in mathematical software packages that allow symbolic entry of
equations and high-level specification of processing, including specific algorithms to be
used, thereby freeing a user of the need to procedurally program individual equations and
algorithms. Such packages include, e.g., Matlab from Mathworks (Natick, MA),
Mathematica from Wolfram Research (Champaign, IL) or S-Plus from MathSoft (Seattle,
WA).

The software component comprises analytic methods of the present invention,
preferably programmed in a procedural language or symbolic package. For example, the
software component preferably includes programs that cause the processor to implement
steps of accepting a plurality of positional data for each quality control probe on each
microarray and storing the data in the memory. For example, the computer system can
accept data manually entered by a user (e.g., by means of the user interface). Alternatively,
however, the programs cause the computer system to retrieve quality control probe
information from a database. Such a database can be stored on a mass storage (e.g., a hard
drive) or other computer readable medium and loaded into the memory of the computer, or
the database can be accessed by the computer system by means of the network.

In one embodiment, the computer readable medium contains an encoded data
structure comprising:

(a) a digital representation of the position of the quality control probes on the
microarray; and

(b) a digital representation of the cycles of synthesis at which each quality
control probe was synthesized.
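
One possible encoding of such a data structure is sketched below. The record layout, the field names, and the choice of JSON as the on-disk form are assumptions made for illustration; the text requires only that position and synthesis cycles be digitally represented.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class QualityControlProbe:
    name: str         # hypothetical label, e.g. "25mer"
    row: int          # (a) position of the probe on the microarray
    col: int
    first_cycle: int  # (b) first synthesis cycle of the binding sequence
    last_cycle: int   # (b) last synthesis cycle of the binding sequence


def encode(probes):
    """Serialize probe records to a JSON string for storage."""
    return json.dumps([asdict(p) for p in probes])


def decode(text):
    """Rebuild probe records from the stored JSON string."""
    return [QualityControlProbe(**record) for record in json.loads(text)]
```

A probe synthesized on a 35 nucleotide spacer, for example, would be stored with first_cycle 36 and last_cycle 60.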

In another embodiment, control microarrays with intentional defects can be
processed and signal intensity patterns and ratios can be stored. The present invention also
encompasses a process by which the signal intensity(ies) and/or resulting ratios from the
sample microarray are compared to the database containing a compendium of known errors.
Should a match be found in the database, the defect in the sample microarray can be
determined.
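
The comparison against a compendium of known errors could take many forms. The sketch below shows one simple possibility, assuming each stored defect is summarized by a set of named ratios and that a match means every ratio agrees within a fixed tolerance. The function name match_defect, the tolerance value, and the matching rule itself are assumptions for illustration, not the procedure of the invention.

```python
def match_defect(observed, compendium, tolerance=0.15):
    """Return the name of the best-matching known defect, or None.

    observed:   {ratio_label: value} for the sample microarray
    compendium: {defect_name: {ratio_label: value}} from control arrays
    A defect matches when every stored ratio lies within `tolerance`
    of the observed ratio; the closest match wins.
    """
    best_name, best_dist = None, None
    for name, stored in compendium.items():
        if stored.keys() != observed.keys():
            continue  # signatures with different ratio sets are not comparable
        dist = max(abs(observed[k] - stored[k]) for k in stored)
        if dist <= tolerance and (best_dist is None or dist < best_dist):
            best_name, best_dist = name, dist
    return best_name
```

Any real compendium would be populated from processed control microarrays; no measured values are implied here.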

In addition to the exemplary program structures and computer systems described
herein, other, alternative program structures and computer systems will be readily apparent
to the skilled artisan. Such alternative systems, which do not depart from the above
described computer system and program structures either in spirit or in scope, are therefore
intended to be comprehended within the accompanying claims.

The following examples are presented by way of illustration of the present
invention, and are not intended to limit the present invention in any way.

6. EXAMPLE 1: QUALITY CONTROL USING QUALITY CONTROL
PROBES

6.1 Demonstration of Synthesis Error 

The inkjet writer uses two inkjet heads for distributing phosphoramidites or
activator onto the glass substrate of the array. Each head contains three sets of 20 nozzles
with each 20-nozzle set dedicated for depositing either a single phosphoramidite or the
activator. The 20 nozzles in a set are arranged in two interlaced columns of ten (see FIG.
1). This pattern allows for the deposition of 20 rows of bases per pass of the inkjet heads,
with each nozzle being responsible for a single row. Because each nozzle is responsible for
a particular row, any clog or other nozzle malfunction can result in all or a portion of rows
being deleted or synthesized inefficiently (detected by a reduction of intensity in the
affected quality control probes) with a 20 row periodicity. FIG. 1 shows a 25,000
oligonucleotide array synthesized with three clogged nozzles (i.e., nozzles 4, 15, and 20).
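
The 20 row periodicity suggests a simple way to trace low-intensity rows back to a nozzle. The sketch below is an illustration only: it assumes rows are numbered from 1 and that row 1 is printed by nozzle 1, which the text does not state, and the names nozzle_for_row and suspect_nozzles are hypothetical.

```python
from collections import Counter

NOZZLES_PER_SET = 20  # from the text: 20 nozzles per set, one row per nozzle


def nozzle_for_row(row):
    """Nozzle (1 to 20) printing a 1-indexed row, assuming row 1 is nozzle 1."""
    return (row - 1) % NOZZLES_PER_SET + 1


def suspect_nozzles(low_rows, min_hits=2):
    """Nozzles implicated by at least `min_hits` low-intensity rows.

    Requiring repeated hits 20 rows apart separates a periodic nozzle
    fault from an isolated low row.
    """
    hits = Counter(nozzle_for_row(r) for r in low_rows)
    return sorted(n for n, count in hits.items() if count >= min_hits)
```

Under these assumptions, low rows 4, 24, and 44 would all point to nozzle 4.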

6.2 Synthesis Failure Detection

Shifted quality control probes are depicted schematically in FIG. 2A. A 25
nucleotide predetermined binding sequence (depicted by a solid line) is synthesized either
directly on the microarray (so that the sequence is made at synthesis cycles 1-25) or on
spacers (depicted by a dashed line). The spacers are shown to be either 20 nucleotides long
(so that the sequence is made at synthesis cycles 21-45) or 35 nucleotides long (so that the
sequence is made at synthesis cycles 36-60). Should there be no synthesis defects during
oligonucleotide microarray synthesis, then the reverse complement of the predetermined
binding sequence should hybridize equally well to the predetermined binding sequence on
all of the quality control probes and give comparable signals.

FIG. 2B schematically depicts a synthesis defect in synthesis cycle 24 of the
oligonucleotide microarray (depicted by the striped bar). Because this affects the sequence
of the predetermined binding sequence when it is either on no spacer or on a 20 nucleotide
spacer, hybridization to its reverse complement will be decreased when compared to the
level of binding that is observed with no synthesis error. The predetermined binding
sequence on a 35 nucleotide spacer is unaffected, however, and thus it should hybridize to its
reverse complement to the same degree as when no synthesis error was present.
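
The reasoning above reduces to checking whether the defective cycle falls inside the window of cycles that built the binding sequence. A minimal sketch, with hypothetical function names and the spacer lengths and 25 nucleotide binding length taken from the text:

```python
BINDING_LENGTH = 25  # length of the predetermined binding sequence


def binding_cycles(spacer_length):
    """Synthesis cycles that build the binding sequence on a given spacer."""
    first = spacer_length + 1
    return range(first, first + BINDING_LENGTH)


def affected_probes(defect_cycle, spacer_lengths=(0, 20, 35)):
    """Spacer lengths whose binding sequence includes the defective cycle."""
    return [s for s in spacer_lengths if defect_cycle in binding_cycles(s)]
```

For the defect at cycle 24 in FIG. 2B, affected_probes(24) returns [0, 20], leaving the probe on the 35 nucleotide spacer as the intact reference.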

A quality control probe having the sequence of SEQ ID NO:1 was synthesized on an
inkjet oligonucleotide microarray either with no spacer (with total length of 25
nucleotides), on a 20 nucleotide spacer (with total length of 45 nucleotides), or on a 35
nucleotide spacer (with total length of 60 nucleotides).

5' ATCATCGTAGCTGGTCAGTGTATCC 3' (SEQ ID NO:1)

The fluorescently labeled reverse complement of SEQ ID NO:1 was used to
hybridize to the oligonucleotide microarray. When there were no synthesis defects during
oligonucleotide microarray synthesis, all of the quality control probes hybridized to their
reverse complement equally well (FIG. 4). This was shown by the comparable levels of
hybridization to a fluorescently labeled reverse complementary oligonucleotide after microarray
processing (see FIGS. 4A-4B). Fluorescent intensity for each quality control probe was
quantified in duplicate on two microarrays and is given in Table 1. Ratios of
average fluorescent intensity of the 25mer to the average fluorescent intensity of the 45mer
or 60mer approach 1, indicating that all bound to their reverse complement comparably.

Similar experiments were conducted with various synthesis cycles being defective
during microarray synthesis in order to ascertain the sensitivity of the quality control
probes. When the first (FIG. 5) or first and second (FIG. 6) synthesis cycles were skipped
during synthesis, only the 25mer hybridization to its complementary fluorescently labeled
oligonucleotide was affected (FIGS. 5A-5B and 6A-6B). Both ratios in Table 1 show a
decrease with respect to ratios seen when no synthesis cycles are skipped. When the thirty
sixth (FIG. 7) or thirty fourth and thirty fifth (FIG. 8) synthesis cycles were skipped, both of
the 45mer and 60mer hybridization to their complementary fluorescently labeled
oligonucleotides were affected (FIGS. 7A-7B and 8A-8B). Both ratios in Table 1 show an
increase with respect to ratios seen when no synthesis cycles are skipped. When there was
inefficient synthesis in the first twenty two synthesis cycles (FIG. 9), only the 25mer
hybridization to its complementary fluorescently labeled oligonucleotide was severely
affected (FIGS. 9A-9B). Both ratios in Table 1 show a decrease with respect to ratios seen
when no synthesis cycles are skipped or inefficient.

7. EXAMPLE 2: QUALITY CONTROL USING QUALITY CONTROL
PROBES WITH NO SPACERS

7.1 Synthesis Failure Detection

Staggered start quality control probes are depicted schematically in FIG. 3A. A
series of 25 nucleotide predetermined binding sequences (depicted by a bold line) are
synthesized directly on the microarray, with the synthesis of individual probe(s) starting at
every synthesis cycle (from synthesis cycle 1-36). Unlike the above strategy, no spacers are
used so that all of the quality control probes are made up exclusively of predetermined
binding sequences that are 25 nucleotides long. The only difference between the
quality control probes is the cycle at which synthesis begins (the bold line depicts the
quality control probe and the thin line depicts synthesis cycles that had no monomer
deposited). The synthesis cycles that make up each quality control probe are listed above
each probe in FIG. 3A. Should there be no synthesis defects during oligonucleotide
microarray synthesis, then the reverse complement of the probe sequence should hybridize
equally well to all of the quality control probes and give comparable signals.

FIG. 3B schematically depicts a synthesis defect in synthesis cycle 29 of the
oligonucleotide microarray (depicted by the gray bar). Because this affects all of the
predetermined binding sequences that have synthesis cycle 29 as part of their sequence (i.e.,
those quality control probes that begin at synthesis cycles 5-29), hybridization of the reverse
complement will be decreased in these quality control probes when compared to the level of
binding that is observed with no synthesis error. Quality control probes that do not contain
a monomer deposited during synthesis cycle 29 (i.e., those quality control probes that begin
synthesis at cycles 1-4 or 30-35) are unaffected, however, and thus they should hybridize to
their reverse complement to the same degree as when no synthesis error was present.
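
The set of staggered start probes touched by a defect follows directly from the start cycle and the 25 nucleotide probe length. The sketch below is illustrative; the function name affected_starts is hypothetical, and the default last start cycle of 35 follows the range used in this paragraph.

```python
def affected_starts(defect_cycles, last_start=35, length=25):
    """Start cycles of staggered probes that include any defective cycle.

    A probe starting at cycle s is built during cycles s .. s + length - 1.
    """
    return sorted({
        s
        for c in defect_cycles
        for s in range(1, last_start + 1)
        if s <= c < s + length
    })
```

For a defect at cycle 29 this returns start cycles 5 through 29, in agreement with FIG. 3B as described above.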

A quality control probe having the sequence of SEQ ID NO:1 was synthesized on an
inkjet oligonucleotide microarray using a staggered start. The quality control sequence was
started at every progressive synthesis cycle from 1 to 35 during the synthesis of the
microarray. The fluorescently labeled reverse complement of SEQ ID NO:1 was used to
hybridize to the oligonucleotide microarray.

When there was inefficient synthesis in the first and second synthesis cycles during
oligonucleotide microarray synthesis, only the first two staggered start quality control
probes were affected (FIG. 10). The mean fluorescence intensity of the quality control
probes at each synthesis cycle was plotted and showed a decrease in intensity only at probes
that contained part of their quality control probe sequence at the first and/or second
synthesis cycles of the microarray (FIG. 10B). All of the quality control probes that had
synthesis that started subsequent to the second synthesis cycle were unaffected and
hybridized to their reverse complement equally well. Similar results were seen when there
was inefficient synthesis in the first five synthesis cycles (FIG. 11), the first eight synthesis
cycles (FIG. 12), or the last fifteen synthesis cycles (FIG. 13) during oligonucleotide
microarray synthesis. In each case, fluorescence intensity decreased only for quality control
probes that had monomers that contributed part of the sequence deposited at the affected
synthesis cycles of the microarray.
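
One way to read such a staggered-start plot in reverse is sketched below: given which start cycles showed reduced intensity and which did not, it lists the synthesis cycles that could have been compromised. This is an illustrative sketch only. It assumes one monomer per cycle and a 25mer probe, and the function name is hypothetical rather than taken from the specification.

```python
def candidate_defective_cycles(affected_starts, unaffected_starts, probe_length=25):
    """List cycles covered by an affected probe but by no unaffected probe.

    Assumption: a probe started at cycle s occupies cycles
    s .. s + probe_length - 1, and any probe that includes a compromised
    cycle shows reduced hybridization.
    """
    def covered(starts):
        cycles = set()
        for start in starts:
            cycles.update(range(start, start + probe_length))
        return cycles

    return sorted(covered(affected_starts) - covered(unaffected_starts))


# Case described for FIG. 10: only the first two staggered-start probes are affected.
print(candidate_defective_cycles([1, 2], range(3, 36)))  # [1, 2]
```

The set difference is the design choice here: a cycle that is also part of an unaffected probe cannot be the compromised one, so only cycles unique to the affected probes remain as candidates.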

8. EXAMPLE 3: INCREASED SENSITIVITY OF QUALITY CONTROL
PROBES

8.1 Using Deletions 

A synthesis failure during oligonucleotide microarray synthesis such that one or
more synthesis cycles are compromised decreases the degree of binding of the quality
control probe to its fluorescently labeled reverse complementary oligonucleotide (see, e.g.,
Sections 6.2 and 7.1 above). However, in instances where only a small number of synthesis
cycles are compromised (i.e., one or two) such that the quality control probe is now slightly
less than full length (i.e., a 24mer or 23mer relative to a full length 25mer), binding to its
reverse complementary oligonucleotide can still be relatively robust. In order to increase
the sensitivity of synthesis failure detection, quality control probes with predetermined
binding sequences already containing a single deletion were used in the methods of the
invention. Such quality control probes had a predetermined binding sequence synthesized
with a deletion in the nineteenth residue (from the 5' end) of SEQ ID NO:1. Any additional
deletions due to a failure during microarray synthesis would exacerbate the defect and result
in an increased deficiency in the ability to bind to the reverse complement of the full length
25mer sequence. FIG. 14 shows that a single-deletion quality control probe on a microarray
with synthesis defects in the thirty fourth and thirty fifth synthesis cycles is more sensitive
than a quality control probe with no deletions.

5' ATCATCGTAGCTGGTCAGGTATCC 3' (SEQ ID NO:2) 
Labeled reverse complement of the full-length 25 nucleotide predetermined binding
sequence was used to hybridize with quality control probes on each microarray. The mean
fluorescence intensity plot of the quality control probes at each synthesis cycle was
determined for each microarray. The full length quality control probe shows a synthesis
defect starting at the fifteenth synthesis cycle (FIG. 14A). The single-deletion quality
control probe shows a synthesis error starting at the eleventh synthesis cycle (FIG. 14B).
Thus the single-deletion quality control probe is a more sensitive measure of microarray
quality.
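
The single-deletion design described above can be illustrated with a small helper that removes one residue, counted from the 5' end, from a sequence. The sketch is hedged: SEQ ID NO:1 is not reproduced in this section, so the 25mer used here is a placeholder in which the unknown nineteenth residue is written as "N", and the helper name is hypothetical.

```python
SEQ_ID_NO_2 = "ATCATCGTAGCTGGTCAGGTATCC"  # 24mer quoted in the text


def delete_residue(sequence, position):
    """Return `sequence` without the residue at 1-based `position` (from the 5' end)."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    return sequence[:position - 1] + sequence[position:]


# Placeholder 25mer (an assumption for illustration): "N" stands for the
# nineteenth residue, whose identity is not stated in this section.
parent_with_unknown = SEQ_ID_NO_2[:18] + "N" + SEQ_ID_NO_2[18:]

print(len(parent_with_unknown), len(SEQ_ID_NO_2))  # 25 24
print(delete_residue(parent_with_unknown, 19) == SEQ_ID_NO_2)  # True
```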

8.2 In Comparison With Correlation Plots 

These experiments show that using microarrays that contain one or more defects can
provide data that, on the surface, looks acceptable. However, when the data is compared to
data from microarrays with no defects, the problems become apparent. Correlation plots
assess the quality of the data by examining the reproducibility of an experiment (e.g., using
fluor-reversed pair analysis). Correlations between fluor-reversed pairs were plotted for
microarrays that had defects in the first twenty two synthesis cycles (FIG. 15A) and
microarrays that had no synthesis defects (FIG. 15B). Oligonucleotides were labeled with
either red or green fluorescent dye and a mixture was used to hybridize to each microarray.
The log10 of the ratio of red to green fluorescent signal was plotted against the log10 of the
ratio of red to green fluorescent signal for a duplicate chip. When data from a microarray
with the first 22 cycles of synthesis skipped was compared to itself, no problem was
detected (FIG. 15A). Similarly, when data from a non-defective microarray was compared
to itself, no problem was detected (FIG. 15B). However, when data from a microarray with
the first 22 cycles of synthesis skipped was compared to the data from a non-defective
microarray, there is a difference (FIG. 15C). Even a defective microarray will result in
data. Because it is not known beforehand what the data should look like, the data from
defective arrays may initially look acceptable. The use of quality control probes according
to the invention safeguards against using poor quality data.
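
The quantity plotted in such a correlation plot can be sketched as follows. The code computes the log10 of the red-to-green ratio for each feature on two chips and a Pearson correlation between the two sets of values. The function names are hypothetical, and the intensity values are made-up illustrative numbers, not measurements from FIG. 15.

```python
import math


def log10_ratios(red, green):
    """Return log10(red/green) per feature; all signals must be positive."""
    return [math.log10(r / g) for r, g in zip(red, green)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Illustrative numbers only (assumed, not taken from the figures).
chip_red, chip_green = [1500.0, 320.0, 90.0, 2100.0], [700.0, 610.0, 95.0, 400.0]
dup_red, dup_green = [1450.0, 300.0, 100.0, 2000.0], [720.0, 590.0, 90.0, 410.0]

r = pearson(log10_ratios(chip_red, chip_green), log10_ratios(dup_red, dup_green))
print(round(r, 3))
```

A high correlation of this kind shows only that two arrays agree with each other; as the text notes, it does not by itself show that either array was synthesized correctly.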

Similar experiments were conducted with oligonucleotide microarrays that had the
first (FIG. 16A), first and second (FIG. 16B), thirty sixth (FIG. 16C), or thirty fourth and
thirty fifth (FIG. 16D) synthesis cycles skipped during synthesis. Data from
oligonucleotide hybridization to the defective microarrays were plotted against data from
non-defective microarrays. Again the plots all look similar and no synthesis defect would
have been detected. This demonstrates that analysis of microarrays with correlation plots is
not sensitive enough to identify defective microarrays.

9. REFERENCES CITED

All references cited herein are incorporated herein by reference in their entirety and 
for all purposes to the same extent as if each individual publication or patent or patent 
application was specifically and individually indicated to be incorporated by reference in its 
entirety for all purposes. 

Many modifications and variations of the present invention can be made without
departing from its spirit and scope, as will be apparent to those skilled in the art. The
specific embodiments described herein are offered by way of example only, and the
invention is to be limited only by the terms of the appended claims along with the full scope
of equivalents to which such claims are entitled.

1. A positionally addressable array comprising a substrate to which are attached
a plurality of different biopolymer probes, said different biopolymer probes in said plurality 
being situated at different positions on said surface and being the product of a step-by-step 
synthesis of said biopolymer probes on said substrate, said plurality of different binding 
probes comprising a plurality of quality control probes, each quality control probe in said 
plurality comprising (i) the same predetermined binding sequence or (ii) a different 
predetermined binding sequence with the same binding specificity, the synthesis of said 
predetermined binding sequence in each said quality control probe having been initiated 
during said step-by-step synthesis at sequential cycles of synthesis. 

2. The array of claim 1 wherein the sequence of each said quality control probe
of said plurality consists of said predetermined binding sequence. 

3. The array of claim 1 wherein said plurality of quality control probes
comprise a second sequence consisting of a chemical structure contiguous with said 
predetermined binding sequence, wherein at least some of said quality control probes differ 
from other of said quality control probes in the length of said chemical structure. 

4. The array of claim 3 wherein said chemical structure is a sequence of
number 0 to N monomers contiguous with said predetermined binding sequence, and where 
N is a whole number equal to or greater than 1. 

5. A method of determining if a positionally-addressable biopolymer array has 
a synthesis defect comprising the following steps in the order stated:

a) contacting the array of any of claims 1-2 with a sample comprising a
binding partner that binds said predetermined binding sequence; 

b) detecting or measuring binding between two or more of said quality 
control probes and said binding partner in the sample; and 

c) comparing binding of said two or more of said quality control probes, 
wherein if said binding is similar, the absence of a synthesis defect between said sequential 
cycles of synthesis of said array is indicated.

6. A method of determining if a positionally-addressable biopolymer array has
a synthesis defect comprising the following steps in the order stated: 

a) contacting the array of claim 3 with a sample comprising a binding 
partner that binds said predetermined binding sequence; 

b) detecting or measuring binding between (i) two or more of said 
quality control probes that differ in the number of said monomers; and (ii) said binding 
partner in the sample; and 

c) comparing binding of said two or more of said quality control probes; 
wherein if said binding is similar, the absence of a synthesis defect between said sequential 
cycles of synthesis used to synthesize said two or more quality probes is indicated. 



7. The method of claim 5 wherein said comparing comprises determining the 
binding ratio of two of said two or more quality control probes, wherein said binding ratio
is the amount of binding of a first of said two quality control probes with said binding
partner, divided by the amount of binding of a second of said two quality control probes 
with said binding partner, and wherein said binding ratio between 0.5 and 2.0 indicates the 
absence of said synthesis defect. 

8. The method of claim 6 wherein said comparing comprises determining the 
binding ratio of two of said two or more quality control probes, wherein said binding ratio 
is the amount of binding of a first of said two quality control probes with said binding 
partner, divided by the amount of binding of a second of said two quality control probes
with said binding partner, and wherein said binding ratio between 0.5 and 2.0 indicates the
absence of said synthesis defect. 

9. The method of claim 5 or 6 further comprising before step (a) the step of
synthesizing said array.

10. The method of claim 5 or 6 wherein said sample comprises (i) total cellular
RNA or mRNA from one or more cells or a plurality of nucleic acids derived therefrom,
and (ii) said binding partner, wherein said binding partner is not expressed by said cells.

11. The array of claim 2, 3, or 4 wherein said biopolymer probes are nucleic
acids.

12. The array of claim 11 wherein said predetermined binding sequence is in the
range of 10-40 nucleotides in length. 

13. The array of claim 11 wherein said biopolymer probes consist of a sequence
in the range of 20-100 nucleotides.

14. The array of claim 12 wherein said predetermined binding sequence is 25 
nucleotides in length. 

15. The array of claim 14 wherein said predetermined binding sequence is SEQ 
ID NO: 1 or a complement thereof. 

16. The array of claim 2, 3, or 4 wherein said biopolymer probes are proteins. 

17. The array of claim 16 wherein said proteins are antibodies. 

18. The array of claim 2 wherein said predetermined binding sequence of said
quality control biopolymer probe is between 10-75% of the length of the
biopolymer probes on the array that are not said quality control probes.

19. The array of claim 18 wherein said predetermined binding sequence consists
of 25 monomers, and wherein said biopolymer probes on the array that are not said quality 
control probes consist of 60 monomers. 

20. The array of claim 4 wherein N is not greater than the number of monomers
in said biopolymer probes on the array that are not said quality control biopolymer probes 
minus the number of monomers in said predetermined binding sequence. 

21. The array of claim 4 wherein N is greater than the number of monomers in
said biopolymer probes on the array that are not said quality control biopolymer probes
minus the number of monomers in said predetermined binding sequence.

22. The array of claim 4 which comprises three of said quality control probes
that differ in N. 

23. The array of claim 22 wherein N is 0, 20, and 35, respectively, for different
quality control probes.

24. A method of making a positionally-addressable array of a plurality of
different biopolymer probes comprising synthesizing said plurality of different biopolymer
probes on a substrate from monomers using a step-by-step synthesis such that each of said
different biopolymer probes is attached to said substrate at a different position on said
substrate, wherein said plurality of different biopolymer probes comprise a plurality of
quality control probes, each quality control probe in said plurality comprising the same
predetermined binding sequence, wherein the synthesis of said predetermined binding
sequence in each of said quality control probes is initiated during said step-by-step
synthesis at sequential cycles of synthesis.

25. The method of claim 24 wherein the sequence of each said quality control
probe of said plurality consists of said predetermined binding sequence.

26. The method of claim 24 wherein said plurality of quality control probes
comprise a second sequence of number 0 to N monomers contiguous with said
predetermined binding sequence, wherein at least some of said quality control probes differ
from other of said quality control probes in the number of said monomers, and where N is a
whole number equal to or greater than 1.

27. The array of claim 1 wherein said plurality of quality control probes
comprise

(i) quality control probes whose sequence consists of said predetermined
sequence; and

(ii) quality control probes that comprise a second sequence of number 0
to N monomers contiguous with said predetermined binding sequence, wherein at least
some of said quality control probes differ from other of said quality control probes in the
number of said monomers, and where N is a whole number equal to or greater than 1.

28. The array of claim 27 wherein said biopolymer probes are oligonucleotides,
said predetermined sequence consists of 25 nucleotides, and said biopolymer probes that are 
not said quality control probes consist of 60 nucleotides. 

29. An oligonucleotide comprising a nucleotide sequence of SEQ ID NO:1 or
SEQ ID NO:2 or the complement thereof.

30. A positionally addressable array comprising a substrate to which are attached
a plurality of different biopolymer probes, said different biopolymer probes in said plurality
being situated at different positions on said surface and being the product of a step-by-step
addition of monomers to said biopolymer probes on said substrate, said plurality of different
biopolymer probes comprising a plurality of quality control probes, each quality control
probe in said plurality comprising at least one labeled monomer, the addition of said labeled
monomer to said quality control probe having been initiated during said step-by-step
synthesis at sequential cycles of synthesis.

31. A method of determining if the positionally-addressable biopolymer array of
claim 30 has a synthesis defect comprising comparing the signal from said at least one
labeled monomer of two or more of said quality control probes, wherein if said signal is
similar, the absence of a synthesis defect between said sequential cycles of synthesis of said
array is indicated.

32. The method of claim 31 wherein said comparing comprises determining the
signal ratio of two of said two or more quality control probes, wherein said signal ratio is
the amount of signal emitted from a first of said two quality control probes divided by the
amount of signal emitted from a second of said two quality control probes, and wherein said
signal ratio between 0.5 and 2.0 indicates the absence of said synthesis defect.

33. The array of claim 30 wherein said biopolymer probes are nucleic acids. 

34. The array of claim 33 wherein said biopolymer probes consist of a sequence
in the range of 20-100 nucleotides.

35. The array of claim 30 wherein said biopolymer probes are proteins.

36. The array of claim 35 wherein said proteins are antibodies.

37. The method of any one of claims 5, 6, or 31 wherein said synthesis defect is a
nozzle failure. 

38. The method of claim 37 wherein said array comprises at least a portion of 
said quality control probes arranged in a periodicity of P and wherein said array is 
synthesized by step-by-step synthesis using an inkjet printhead with P nozzles, and where P 
is a whole number equal to or greater than 1.

39. The method of claim 38 wherein P equals 20. 

40. A method of detecting a nozzle failure using the positionally addressable 
array of claim 1 or 2 comprising the following steps in the order stated: 

a) contacting the array of any of claims 1 or 2 with a sample comprising 
a binding partner that binds said predetermined binding sequence, wherein at least a portion 
of said plurality of quality control probes is arranged in a periodicity of P and wherein said 
array is synthesized by step-by-step synthesis using an inkjet printhead with P nozzles, 
wherein P is a whole number equal to or greater than 1;

b) detecting or measuring binding between two or more of said quality 
control probes and said binding partner in the sample; and 

c) comparing binding of said two or more of said quality control probes 
in a periodicity of P, wherein if said binding is similar, the absence of a nozzle defect is
indicated. 

41. A method of detecting a nozzle failure using the positionally addressable
array of claim 30 comprising comparing the signal from said at least one labeled monomer
of two or more of said quality control probes in a periodicity of P, wherein at least a portion
of said plurality of quality control probes is arranged in a periodicity of P and wherein said
array is synthesized by a step-by-step synthesis using an inkjet printhead with P nozzles,
wherein if said signal is similar, the absence of a nozzle defect is indicated, and wherein P is
a whole number equal to or greater than 1.
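
By way of illustration only, the ratio comparison recited in claims 7, 8, and 32 can be sketched as follows. The function names are hypothetical, and treating the bounds 0.5 and 2.0 as inclusive is an assumption, since the claims say only "between 0.5 and 2.0".

```python
def binding_ratio(first_signal, second_signal):
    """Binding (or signal) of a first quality control probe divided by that of a second."""
    if second_signal <= 0:
        raise ValueError("second signal must be positive")
    return first_signal / second_signal


def ratio_indicates_no_defect(first_signal, second_signal, low=0.5, high=2.0):
    """True when the ratio lies between `low` and `high` (bounds assumed inclusive)."""
    return low <= binding_ratio(first_signal, second_signal) <= high


# Illustrative intensities only (assumed values, not data from the specification).
print(ratio_indicates_no_defect(1200.0, 1000.0))  # True  (ratio 1.2)
print(ratio_indicates_no_defect(300.0, 1000.0))   # False (ratio 0.3)
```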